#7 GSoC’18 – First evaluation

This week was the first evaluation for the GSoC program. I got great feedback from my awesome mentors and cleared the first phase with flying colors. Hooray! As for the work, this week was mainly about fixing issues with the features I added in the past few weeks and refactoring code with the help of Matthias.

Adding JUnit tests for SBSCL

SBSCL was recently refactored to follow Maven standards, which made the overall build and test process much smoother. Several JUnit tests that check existing features were added to the src/test/java folder. After adding all the tests, the folder looks like this:

 org/simulator/
 |- sbml -> Contains test files for SBMLTestSuite and individual SBML models
 |- sedml -> Contains test files for SED-ML L1V1, L1V2 and L1V3
 |- comp -> Contains test files to run hierarchical SBML models
 |- omex -> Contains test files to run an OMEX archive
 |- fba -> Contains test files to run constraint-based models

Each JUnit test refers to a resource file which can be found in src/test/resources. Currently, we keep a few test resources of each type in the repository, while large resource sets such as SBMLTestSuite and the BiGG models are downloaded when the build runs on Travis.
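To illustrate how a test can find its resource file, here is a minimal sketch in plain Java. It assumes the usual Maven behaviour of placing everything under src/test/resources on the test classpath. The class name TestResources and the path sbml/model.xml are made up for this example and are not part of SBSCL.

```java
import java.net.URL;
import java.util.Optional;

// Hypothetical helper, not part of SBSCL: looks up a test resource on the classpath.
final class TestResources {

    private TestResources() { }

    // Returns the URL of a classpath resource, or an empty Optional if it is missing.
    // With Maven, files under src/test/resources end up on the test classpath,
    // so a relative name such as "sbml/model.xml" (assumed path) would resolve there.
    static Optional<URL> find(String name) {
        ClassLoader loader = TestResources.class.getClassLoader();
        return Optional.ofNullable(loader.getResource(name));
    }

    public static void main(String[] args) {
        String name = args.length > 0 ? args[0] : "sbml/model.xml";
        System.out.println(find(name)
                .map(URL::toString)
                .orElse("resource not found: " + name));
    }
}
```

Returning an Optional lets a test skip or fail with a clear message when a large resource set has not been downloaded yet, instead of hitting a NullPointerException.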

Continuous integration with Travis

Continuous integration with Travis was the obvious next step. Travis is a hosted service that integrates with GitHub and tests your code on every commit. This is especially useful when developers use different operating systems, since code might build locally but break when others try it. The working-branch of SBSCL is now integrated with Travis, and after every commit Travis automatically verifies the build and runs all the JUnit tests.

To integrate Travis into your repo, simply add a .travis.yml file to the root directory of your repo and define the language and version. Simple! In our case, we use Java as our programming language and Maven to build our tool, so a simple file looks like this:

language: java

jdk:
  - oraclejdk8

install:
  - cd src/test

script:
  - mvn clean install

Using a new data structure for dataGenerator with repeatedTasks

Our current work is related to repeated tasks and SED-ML L1V2 support. After adding support in SBSCL to run repeated tasks, all the code that follows has to be updated to use a different data structure.

Currently, all our code is on the working-branch. Once all the current SED-ML issues are addressed, we will do our first merge into the master branch.

 


#6 GSoC’18 – Restructure SBSCL

This week’s work mainly consisted of a comprehensive refactoring of the codebase. Originally, SBSCL used Ant scripts for the build process, which explains the current directory structure. However, over time the standards changed and SBSCL’s build process moved to Maven. I have already talked about Maven in a few blog posts when introducing the build process, and about the issues I faced when building SBSCL with Maven. Most of these arose because SBSCL’s codebase was not organized according to Maven standards.

Before restructuring, this is how the code was organized:

 /
 |- dist        -> Contains a JAR file of the library
 |- doc
    |- api      -> JavaDoc including examples for usage of the library
 |- lib         -> 3rd party libraries needed for compilation and execution
 |- licenses    -> License agreements of all 3rd party libs and a list of 
 |                 authors of this library
 |- resources   -> A source folder containing required resource files.
 |- src         -> The main source folder containing all Java files and the 
 |                 overview.html providing a brief overview of the project.
 |- test        -> Source code for testing, including BioModels and SBML Test
 |                 Suite
 |- build.xml   -> an Apache ANT script which compiles the source code and
 |                 provides several options to create distribution files.
 |- LICENSE.txt -> the license, under which this project is distributed
 |- pom.xml     -> Maven support for the project
 |- README.txt  -> this file

The original organization had multiple top-level folders; this has been changed so that all the code lives inside the src folder. As per Maven standards, all the code should be inside src/main/java and JUnit tests should be in src/test/java, while the resources for these should be in src/main/resources and src/test/resources respectively.

This restructuring is especially helpful since a simple Maven run will not only compile the code but also execute all the JUnit tests. The following command can be used for a clean Maven install:

mvn -U clean install

Currently, most tests inside src/test/java are not JUnit files. Instead, they are classes with a main method, so they won’t be executed automatically. Upcoming work will be to convert these to JUnit tests.

Let me know if you have any questions or issues.

#5 GSoC’18 – Plot support in SBSCL

This week I implemented a simple interface for plotting output data. SBSCL has historically been a command-line-only library and therefore never had a plotting feature.

To implement plot support, I used the existing Java library JFreeChart. As of now, there is an extra class PlotMultiTable that takes a MultiTable data structure as input and plots all of its columns. MultiTable is a custom AbstractTableModel-like data structure which is used in SBSCL to store and transport information.

To plot the output generated from your biological model, simply write the following:

 // plot all the reaction species
 PlotMultiTable p = new PlotMultiTable(solution, "Output plot");
 p.pack();
 RefineryUtilities.centerFrameOnScreen(p);
 p.setVisible( true );

The PlotMultiTable class takes a MultiTable as input and generates an output plot with the title “Output plot”. If you are interested in plotting only a subset of the species, you can filter the MultiTable before passing it to PlotMultiTable.

 // plot only a subset of the MultiTable
 PlotMultiTable p = new PlotMultiTable(solution.getBlock(1), "Output plot");
 p.pack();
 RefineryUtilities.centerFrameOnScreen(p);
 p.setVisible( true );

This code will only plot the second column from the MultiTable.

You can also filter which timepoints to display by applying this filter:

solution.filter(new double[]{t1, t2, ... tN})
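To make the idea of filtering by time point concrete, here is a small self-contained sketch. It is not SBSCL's implementation of MultiTable.filter. The class TimePointFilter, the array-based table and the tolerance parameter are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a time-indexed table; this is not SBSCL's MultiTable.
final class TimePointFilter {

    private TimePointFilter() { }

    // Keeps only the rows whose time value matches one of the wanted
    // time points, within the given tolerance.
    static List<double[]> filter(double[] times, double[][] rows,
                                 double[] wanted, double tol) {
        List<double[]> kept = new ArrayList<>();
        for (int i = 0; i < times.length; i++) {
            for (double w : wanted) {
                if (Math.abs(times[i] - w) <= tol) {
                    kept.add(rows[i]);
                    break; // do not add the same row twice
                }
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        double[] times = {0.0, 0.5, 1.0, 1.5};
        double[][] rows = {{1.0}, {2.0}, {3.0}, {4.0}};
        List<double[]> kept = filter(times, rows, new double[]{0.5, 1.5}, 1e-9);
        System.out.println("rows kept: " + kept.size());
    }
}
```

Comparing with a tolerance rather than == avoids dropping rows when the requested time points differ from the stored ones by floating-point rounding.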

Let me know if you have any questions.

#4 GSoC’18 – Repeated tasks

This week I have been working on implementing support in SBSCL for RepeatedTasks, which were added in SED-ML L1V2. Briefly, RepeatedTask is a looping construct added to SED-ML to run the same tasks multiple times. Before every iteration, there is also an option to reset model parameters or update them, similar to a looping construct in any programming language.
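The looping idea can be sketched in a few lines of plain Java. This is only an illustration of the concept, not the code in SBSCL or jlibsedml. The Model class, the subTask method and the growth formula are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of a repeated task; not SBSCL or jlibsedml code.
final class RepeatedTaskSketch {

    // Minimal mutable "model": one parameter and one state variable.
    static final class Model {
        double rate;
        double state;

        Model(double rate, double state) {
            this.rate = rate;
            this.state = state;
        }
    }

    private RepeatedTaskSketch() { }

    // Runs the sub-task once per range value. If resetModel is true, the state
    // is restored to its initial value before each iteration.
    static List<Double> run(Model model, double[] range, boolean resetModel) {
        double initialState = model.state;
        List<Double> results = new ArrayList<>();
        for (double value : range) {
            if (resetModel) {
                model.state = initialState;
            }
            model.rate = value;       // parameter update applied before the iteration
            subTask(model);
            results.add(model.state); // collect one result per iteration
        }
        return results;
    }

    // Stand-in sub-task: a single growth step, state <- state * (1 + rate).
    static void subTask(Model m) {
        m.state = m.state * (1 + m.rate);
    }

    public static void main(String[] args) {
        double[] range = {1.0, 1.0};
        System.out.println("with reset:    " + run(new Model(0.0, 1.0), range, true));
        System.out.println("without reset: " + run(new Model(0.0, 1.0), range, false));
    }
}
```

With a reset, every iteration starts from the same state; without it, each iteration continues from where the previous one stopped, which is why the two runs produce different results.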

I have only partially implemented it. There are two challenges that I need to overcome: a) handling nested repeatedTasks, and b) merging the output of repeatedTasks. I will discuss possible solutions to these issues with my mentors soon and then implement them.

An extra feature that I worked on adding to SBSCL was support for OneStep and SteadyState simulations. These elements were also added in SED-ML L1V2. Basically, an ODE simulation can be UniformTimeCourse, OneStep or SteadyState. The first type of simulation was already implemented in SBSCL, while the other two need to be incorporated. I have already added the OneStep simulation since the idea is simple: generate only one simulation point after a given time. The SteadyState simulation will be implemented in the upcoming week after discussing potential solutions with my mentors. In layman’s terms, a SteadyState simulation means simulating until the point where the populations stop changing.
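The difference between OneStep and SteadyState can be shown with a toy example. The sketch below uses a plain explicit Euler update on a single equation dx/dt = f(x) as a stand-in; it is not how SBSCL's solvers work, and the class SimulationKinds, the step size and the tolerance are assumptions chosen for illustration.

```java
import java.util.function.DoubleUnaryOperator;

// Toy illustration using explicit Euler; not SBSCL's solver code.
final class SimulationKinds {

    private SimulationKinds() { }

    // OneStep: advance from x0 over 'duration' and return only the final value.
    static double oneStep(DoubleUnaryOperator f, double x0, double duration, double dt) {
        int steps = (int) Math.round(duration / dt);
        double x = x0;
        for (int i = 0; i < steps; i++) {
            x += dt * f.applyAsDouble(x); // explicit Euler update
        }
        return x;
    }

    // SteadyState: integrate until |dx/dt| drops below tol, or give up after maxSteps.
    static double steadyState(DoubleUnaryOperator f, double x0, double dt,
                              double tol, int maxSteps) {
        double x = x0;
        for (int i = 0; i < maxSteps; i++) {
            double dxdt = f.applyAsDouble(x);
            if (Math.abs(dxdt) < tol) {
                return x; // the value has stopped changing
            }
            x += dt * dxdt;
        }
        throw new IllegalStateException("no steady state within " + maxSteps + " steps");
    }

    public static void main(String[] args) {
        DoubleUnaryOperator f = x -> 2.0 - x; // production minus decay, steady state at x = 2
        System.out.println("one step to t=1: " + oneStep(f, 0.0, 1.0, 0.001));
        System.out.println("steady state:    " + steadyState(f, 0.0, 0.01, 1e-6, 1_000_000));
    }
}
```

OneStep returns a single point after a fixed duration, while SteadyState has no fixed end time and stops on a convergence criterion, so it needs a tolerance and a safeguard against models that never settle.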

I will keep posting updates as and when I have them. Till then, have a good one!

#3 GSoC’18 – Coding begins

The coding phase for Google Summer of Code (GSoC) started on May 14th and since then I have spent day and night just reading, understanding and writing code for SBSCL.

Build process:

I talked about the build process last week, but this week I tried something fancy: building SBSCL with Maven directly from the command line instead of through the m2e plugin. You might think that since everything works fine in Eclipse, it should also be simple with mvn, but that’s not the case. To cut things short, I recommend using Eclipse as the build environment for SBSCL; since the latest version already ships with m2e, the build process is a piece of cake. All you have to do is right-click > Run As > Maven build (or Maven install).

  1. Import the java folder in Eclipse
  2. Right-click on the project and choose Run As > Maven build

If you get missing-dependency errors, you need those jar files to build. Most jar files required by SBSCL are available in the Maven Central repo, except cplexGLPKSolver, lpsolve and SCPsolver. Note that Maven might sometimes give the error: unknown build environment. This can happen if you are using a JRE instead of a JDK. You can solve this by installing a JDK or by retrying the Run As command.

Once you have all the dependency JAR files, they need to be added to lib/{dependency_name}. You will find empty folders already created inside the lib folder for you to place the jar files in.

For those of you who still want to build with Maven from the command line, you will have to edit the pom.xml file to add all the source folders to the classpath and then run the following command to create a standalone jar.

mvn clean package install assembly:single

After the build process, the standalone jar file is created in the target folder.

Tests for SBSCL:

Moving on, I focused mainly on running several test suites to check the existing code of SBSCL. The SBML community has a wonderful archive with several models which can be used to test your code. You can download a zip file from their GitHub page. There is also the BioModels database, which contains models from several studies. This database is also a wonderful resource for testing simulation software. Similarly, there is an online database of constraint-based models by BiGG, and you can download a zip file containing all the models.

Extending SBSCL to support SED-ML L1V2:

After trying all the examples and testSuites, I started learning more about SED-ML Level 1 Version 2 since SBSCL needs to support repeatedTasks and Range which were added in SED-ML Level 1 Version 2. One interesting thing that I wasn’t able to find was a full testSuite of SED-ML examples similar to SBML. Anyway, I am still working on this task and will post some updates soon.

 

#2 GSoC’18 – Community bonding

The community bonding phase (of GSoC) is a brief period before the beginning of the actual coding phase where students get to know how the organization they were selected for works. I had my first (Hangouts) meeting with my mentors Andreas Dräger, Matthias König and Nicolas Rodriguez, where we discussed how NRNB works and what they expect from me as a student. It was exciting to learn about setting up the development environment for the simulation core library (SBSCL) and about other resources such as GitHub Projects, which I will use to set up individual milestones.

Setting up the development environment for SBSCL:

Download and install the Eclipse IDE (the latest version is Oxygen) and git. Once you have Eclipse and git on your computer, simply clone the repo using git clone <repo_name>. After discussing it with my mentors, we decided that I should fork SBSCL into my GitHub account before cloning it, to be safe. This way, I can play around with the code without worrying about the master branch. Once I had cloned the repo, I imported it into Eclipse. Since SBSCL uses Maven for its build process, it was pretty easy to set up.

Some of you might run into the problem: Error- Unable to find cplex-12.6.1.0.jar. This is because SBSCL’s constraint-based model solver requires IBM’s proprietary linear programming solver library. Fortunately, IBM provides a free student license with a university email address, so I was able to download the jar file and import it as a dependency. Note that the current cplex version is 12.8.1.0 and not 12.6.1.0. I wasn’t able to find 12.6 online, so I modified the pom.xml file to expect 12.8.1.0. This worked like a charm! A great thing is that one of my GSoC tasks is to remove this dependency (by using an open-source API like SCPSolver) so that everyone can use the linear programming feature. I will post an update once I finish implementing it.

Workflow of using the simulation core library

The library contains an examples folder which has simple code to read, parse and simulate a systems biology model. You can write your own SBML model or download some samples from here – http://www.ebi.ac.uk/biomodels-main/. There are a lot of models there, and you can simply try downloading the model of the month. Once you have your model, you can validate it here – http://sbml.org/validator/

Now that you know you have a valid SBML model, you can try making a new Java project and running the SimulatorExample.java file as the main file for the project. Voila! You will get the output of the model as a MultiTable, which can be plotted using a Java plotting library (also one of my GSoC tasks).

You can also try running the CPLEX solver (if you have a license) by using the COBRASolverExample.java file found in the example folder.

Remember that you need to import the jar dependencies of simulation core, jlibsedml, commonMath etc. All the dependencies are written in the HelloWord code documentation page of SBSCL which can be found inside doc folder.

Wrapping up, I also found a glitch in the README.md file, which says there are multiple jar files in the dist folder. I changed it and pushed my first commit. Yay! It is in my forked GitHub repo though.

Let me know if you have any questions or suggestions in the comments 🙂

Getting started

This summer I will be working with a fantastic open-source organization, the National Resource for Network Biology (NRNB), thanks to Google accepting my Google Summer of Code (GSoC) proposal titled “Simulating systems biology models in Java”. The official coding doesn’t begin until May 14th; however, I am currently getting familiar with the open-source practices and my build environment. I have been assigned three wonderful mentors, namely Andreas Dräger, Matthias König and Nicolas Rodriguez, who will guide me throughout this process.

This blog is where I intend to share my progress. I look forward to an exciting and productive summer.

What is GSoC?


Google Summer of Code is a global program focused on bringing more student developers into open source software development. Students work with an open source organization on a 3-month programming project during their break from school. You can learn more about it by visiting http://g.co/gsoc

What is NRNB?


The aim of the NRNB is to advance the new science of Biological Networks through analytic tools, visualizations, databases and computing resources. Biomedical research is increasingly dependent on knowledge of biological networks of multiple types and scales, including gene, protein and drug interactions, cell-cell and cell-host communication, and vast social networks. Our technologies enable researchers to assemble and analyze these networks and to use them to better understand biological systems and, in particular, how they fail in disease. You can learn more about it by visiting http://nrnb.org/