An Ontology Editor for Android. 5. Mobile Considerations - Power


In my previous post I showed the comparative ease with which the execution times of methods can be monitored using the TimingLogger class of the Android logging framework. Apart from monitoring time, the next obvious question on a mobile device is: how much power does my application/thread/instruction consume on execution?

Monitoring power consumption is, of course, not only interesting for my little application and ontology loading, but also for on-device data processing and reasoning. Particularly in the context of the internet of things, this will be very important: constantly sending data back and forth between a device and a server for data processing and retrieval of processed data will soon exceed any network capacity we may have. On-device processing is one way of mitigating this.

So I went hunting for information on how power consumption can be monitored on Android devices and was somewhat disappointed with what I found. The bottom line is: there is nothing provided by the Android framework itself. Functionality along the lines of "Settings > ... > Battery Usage" is not programmatically available via the official APIs.
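About the only power-related signal the framework does expose is the coarse battery level, via the sticky ACTION_BATTERY_CHANGED broadcast. That is nowhere near per-application power profiling, but it is enough for rough before/after comparisons around a long-running operation. A minimal sketch:

import android.content.Context;
import android.content.Intent;
import android.content.IntentFilter;
import android.os.BatteryManager;

// ACTION_BATTERY_CHANGED is a sticky broadcast, so registering a null
// receiver simply returns the most recently broadcast battery status Intent.
public static float batteryPercentage(Context context) {
    IntentFilter filter = new IntentFilter(Intent.ACTION_BATTERY_CHANGED);
    Intent status = context.registerReceiver(null, filter);
    int level = status.getIntExtra(BatteryManager.EXTRA_LEVEL, -1);
    int scale = status.getIntExtra(BatteryManager.EXTRA_SCALE, -1);
    return 100f * level / scale;
}

Sampling this before and after a long operation gives only a very crude estimate - the level is quantised to whole percent on most devices - but it is the best the stock APIs offer.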

There are a number of open-source and academic applications such as PowerTutor, but on the whole these seem to suffer from unreliable support: in many cases they haven't been maintained in a long time or are device-specific.

If your Android device happens to run on a Qualcomm Snapdragon processor, you may be able to use the Trepn power monitoring tool provided by Qualcomm, available both as an .apk and as an Eclipse plugin. The only Android device I currently have is the 2012 Nexus 7, which runs on an NVIDIA Tegra 3 processor, though the 2013 model does indeed have a Snapdragon.

The most interesting tool I have seen so far - and I haven't had a chance to test it - is a commercial tool by Little Eye Labs, called - surprisingly - "Little Eye". The tool is free to evaluate for 30 days, but after that it costs 50 USD per month for a single pro licence, making it prohibitively expensive for a hobby programmer.

I haven't found much else - as always, I am grateful for any comments and pointers to other solutions.

An Ontology Editor for Android. 4. Mobile Considerations - Ontology Load Times

In my last post, I showed the beginnings of an ontology editor for Android. When I developed this first version, I did so mainly against relatively small ontologies, such as the General Formal Ontology, the Basic Formal Ontology and others.

Once this was stable, I tested in particular the loading of ontologies without import statements (which need to be resolved over the web) from a location on the device. As I was part of the ChEBI team for a short while, it was natural to try ChEBI - a much larger ontology.

Well, I was in for a rude shock. Once I had selected the ontology and started the load - well - nothing happened. The application didn't crash, and the developer console clearly indicated that it was running - and, more importantly, that the memory allocated to the process kept expanding. So I stopped the load and decided to dig into this a bit more.

Logging on Android

Android has a rather wonderful logging framework built in, and getting hold of the data was a breeze using the Android TimingLogger class:

// TimingLogger only emits output if its tag is loggable at VERBOSE level,
// e.g. after: adb shell setprop log.tag.LOAD VERBOSE
TimingLogger timings = new TimingLogger("LOAD", "ontology loading");

// manager is an OWLOntologyManager, in an InputStream onto the ontology document
ontology = manager.loadOntologyFromOntologyDocument(in);
timings.addSplit("LoadTime");   // record the time elapsed since the logger was created
timings.dumpToLog();            // write all splits to LogCat
Log.d("LOAD", "loading finished");

The output from the logger will then show up in the LogCat console (note that TimingLogger stays silent unless verbose logging is enabled for its tag, as in the comment above).

 

Ontology Load Times

To get a better handle on loading behaviour, I started to look at load times for ChEBI (I can't remember which version - this is not a completely scientific experiment), GFO and DOLCE-Light. I loaded each ontology from the shared storage area on the device into the application and observed the load process, repeating the experiment 10 times for each ontology. The link to the data is here. The bottom line: GFO and DOLCE-Light, with 78 and 37 classes respectively, had reasonably similar load times (within standard deviation): 483 (+/- 38) ms and 540 (+/- 35) ms. This is still somewhat surprising, as DOLCE-Light is the smaller ontology and should have loaded faster (but maybe there are other factors at play here). ChEBI, by contrast, required a whopping 370391 (+/- 7039) ms - about 6 minutes - to load. The device didn't crash and I could explore the class list just fine. I did, however, have to remove the memory restrictions on the application and allow OntoDroid to take over whatever device memory it could get hold of.
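A load that takes minutes clearly cannot run on the UI thread - Android would raise an "Application Not Responding" dialog long before it finishes. A sketch of how the load can be pushed onto a background thread (the class and tag names are mine, not necessarily what OntoDroid uses):

import java.io.InputStream;
import org.semanticweb.owlapi.apibinding.OWLManager;
import org.semanticweb.owlapi.model.OWLOntology;
import org.semanticweb.owlapi.model.OWLOntologyCreationException;
import android.os.AsyncTask;
import android.util.Log;

// Hypothetical background loader: doInBackground runs off the main thread,
// onPostExecute is called back on the UI thread with the result.
private class LoadOntologyTask extends AsyncTask<InputStream, Void, OWLOntology> {
    @Override
    protected OWLOntology doInBackground(InputStream... in) {
        try {
            return OWLManager.createOWLOntologyManager()
                             .loadOntologyFromOntologyDocument(in[0]);
        } catch (OWLOntologyCreationException e) {
            Log.e("LOAD", "ontology load failed", e);
            return null;
        }
    }

    @Override
    protected void onPostExecute(OWLOntology ontology) {
        // safe to touch the UI here, e.g. populate the class list view
    }
}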

This is a problem which will, in time, probably solve itself as devices get more powerful. Until then, however, it has consequences for how to program: OWLOntology is not serialisable and hence cannot simply be passed around via Intents.
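One pragmatic workaround - a sketch, not necessarily how OntoDroid handles it - is to keep the loaded ontology in a process-wide holder and pass only a reference (say, its IRI as a string) through the Intent:

import org.semanticweb.owlapi.model.OWLOntology;

// Hypothetical process-wide holder: activities share the loaded ontology
// through this singleton instead of serialising it into an Intent.
public final class OntologyHolder {
    private static OWLOntology ontology;

    private OntologyHolder() {}

    public static synchronized void set(OWLOntology o) { ontology = o; }
    public static synchronized OWLOntology get() { return ontology; }
}

This also keeps exactly one copy of the ontology in memory, which matters when that one copy already strains the heap.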

More about this later.

 

An Ontology Editor for Android. 3. It's here.

Due to the usual combination of distractions caused by work and a significant amount of travel, I haven't blogged about this as much as I wanted to recently. Still, the ontology editor project for Android is progressing and a first version is here. There are still bugs (of course) and, design-wise, it is pretty squidgy round the edges (more material for another blog post), but it does many of the things I set out as specifications in my previous post.

Here are some screenshots that show the current progress.

 

The Start Screen.

Loading an ontology from a shared location: the Downloads folder.

Displaying a list of ontology classes. Here, the General Formal Ontology (GFO) has been loaded.

A detailed view onto a class and the ability to add subclass/superclass relationships.

Editing a class.

Creating a new ontology - defining IRIs and file name. From there it goes into the editor shown above.

I will work on this further, squash some bugs and introduce more functionality. I have also run into a number of problems relating to the mobile form factor and the computing power of mobile devices, but that merits several separate posts.

Some questions that immediately arise from these screenshots are:

  • Mobile devices have limited screen real estate. What is the best way of displaying complex and multi-level hierarchies?
  • How should non-hierarchical relationships be displayed?
  • What do editor user interfaces need to look like when asserting axioms on a class (clearly, the complex UIs offered by Protege are not appropriate)?

It would be interesting to kick off that discussion and see what the community thinks. Feedback and comments are always welcome.

 

An Ontology Editor for Android - Defining a first set of specs

In my previous post, I discussed some of the motivation for wanting to write an ontology editor - or at least the beginnings of one - for the Android platform. In this post, I want to discuss how I got started.

Before writing any code, I actually sat down and spec'ed out the first bits of functionality that I wanted in place before releasing anything. A 0.1 version of the ontology editor - working title "ontodroid" - should have the following functionality:

  • ontodroid should be able to create a new ontology from scratch
  • ontodroid should be able to load an existing ontology from a predefined shared location on the Android device - for the first version I have decided that this should be the "Downloads" folder. In future iterations, Dropbox and Google Drive functionality should be added.
  • ontodroid should be able to list all non-anonymous classes in the ontology in a list view: no hierarchical representation at this stage (see the sketch after this list)
  • ontodroid should display all asserted superclasses and subclasses of a class selected from the class view
  • ontodroid should allow the assertion of new subclass and superclass relationships
  • ontodroid will not deal with ontology imports in this iteration
  • ontodroid will not allow the user to define new relations or relate classes via relations in this iteration
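With the OWL API, the class listing and the subclass assertions map onto a handful of calls. A minimal sketch - assuming ontology and manager are the loaded ontology and its OWLOntologyManager, and sub/sup are OWLClass objects picked in the UI:

import java.util.ArrayList;
import java.util.List;
import org.semanticweb.owlapi.model.*;

// List all named (non-anonymous) classes for the list view
List<String> classNames = new ArrayList<String>();
for (OWLClass cls : ontology.getClassesInSignature()) {
    classNames.add(cls.getIRI().getFragment());  // short name, e.g. "Continuant"
}

// Assert a new subclass relationship: sub SubClassOf sup
OWLDataFactory factory = manager.getOWLDataFactory();
OWLAxiom axiom = factory.getOWLSubClassOfAxiom(sub, sup);
manager.applyChange(new AddAxiom(ontology, axiom));

classNames can then back a plain ArrayAdapter<String> for the Android ListView.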

I even did some wireframes using the excellent Balsamiq tool (it can be used for free on their website), but I suspect that by the time some of this is implemented, it'll look nothing like the wireframes - so I am not going to post them here.

Time to get hacking.... 

 

Version Control for the Lone Scientist Part 2 - BioMedCentral

Following on from my first blog post, version control - and using the social aspects of distributed version control systems for science - seems to be a general hot topic at the moment. BioMedCentral have just announced on their blog that they have teamed up with the folks at Github to better understand how scientists use it to go about their science, what some of the use cases are, and what crystallises as best practice.

They have promised to blog about their findings on the BioMedCentral and Github blogs, which should make for some interesting reading in the future.

The post also contains a link to an open-access paper [1] in Source Code for Biology and Medicine discussing how git and version control systems can lead to greater reproducibility. Skimming it, however, it seems to be mainly a position paper - it would be wonderful if someone undertook a full scientific study of whether tools like git really do lead to more reproducible computational experiments.

[1] K. Ram, Source Code for Biology and Medicine (2013) 8:7 (doi:10.1186/1751-0473-8-7)

Version Control for the Lone Scientist

I am an unabashed fan of distributed version control systems (DVCS) such as Mercurial or Git. And from time to time, I get drawn into discussions with friends and colleagues about the pros and cons of these.

One objection in particular comes up time and time again: DVCS are complete overkill for the lone coder and - by extension - for the lone scientist. Here are some thoughts on version control for computational scientists, working alone or collaboratively.

In summary, I think one can get at this from several angles: (a) management of change and a state of mind, (b) the reproducibility of the computational experiment, (c) showcasing yourself as a researcher/hacker and novel measures of impact. These all overlap to a certain extent.

(a) Management of Change and a State of Mind

This is the obvious one. Version control systems manage change - that is the trivial and obvious thing to state. But they do it in very different ways. Version control systems such as CVS or Subversion are in essence feudalistic models of working: a central server holds a canonical version of an artifact (source code, an ontology, a piece of writing), which gets pushed to clients. Because of this feudalism, "commit" equals "inflict": someone commits a change to an artifact and it gets inflicted on all the clients working with the same repository. The consequence? Commits become rare events (I realize that the discussion as to whether atomic commits are a good thing, or whether even broken code should get checked in, is one for an evening in the pub) - code hardly ever got checked in.

Contrast this with distributed version control systems. Here there is a staging system: code lives in the repository on your machine and you develop against that. Code may also exist on another machine/server/host such as Bitbucket or Github, which may or may not hold a canonical version. Either way, the commit into your local repository is not coupled to the "push" that adds your changes to the remote one - and hence "commit" is no longer the same as "inflict".

Typically, the result is more commits, at least locally, and the preservation of work. And this makes sense even for the lone developer.

Another aspect of this discussion concerns the way in which changes are tracked by these systems: Subversion and others of a similar ilk track versions - changes in the file system - whereas git and Mercurial track what has actually been changed. Again an almost trivial statement, but it has huge implications. Merging becomes much, much easier that way - resulting in more branching, more commits, more experiments. That's a good thing, particularly as a scientist. Much work in computational and data science involves parameter sweeps - running the same protocol again and again, with altered parameters each time. Developing workflows and computational procedures quite often requires experimentation: starting from a baseline script, branching, making changes, merging these back, branching again, experimenting, and so on. The commit and branching mechanisms in version control systems can be used to track and document these experiments: it is a step towards reproducible computational science.

It is also a state of mind: the staging involved in push and pull mechanisms, in addition to a commit, enables distributed and therefore massively collaborative working. And sooner or later even the loneliest of lonely scientists will have to engage in this way of working, if he or she wants the world to acknowledge and take up the work that has been done. This way of working is so powerful that tools such as git and Mercurial are now also used to author legislation (http://www.quora.com/Ari-Hershowitz/Posts/Hackathon-Anyone-Recode-Californias-Laws), to distribute information (German law, for example, is available on Github) and even to document a house renovation (http://www.wired.com/wiredenterprise/2013/01/this-old-house/). Bottom-up, massively collaborative ways of working are the ways of the future - distributed version control systems are one embodiment of this mindset.

(b) The Reproducibility of the Computational Experiment

This picks up the discussion begun in the previous point. When taking version control systems and combining them with conventions for organizing the other components of, for example, bioinformatics projects, we might be able to tackle the reproducibility of computational experiments/investigations a bit better. There has been some discussion around this on Biostars and also in the literature, most notably in a paper by Noble on organizing computational biology projects and our Lensfield paper from a while back. Version control here fulfils three functions: (a) backup, (b) keeping a historical record of work done, and (c) enabling concurrent work by multiple collaborators - which may sooner or later happen to even the loneliest of scientists.

(c) Showcasing yourself as a hacker/developer/bioinformatician/scientist/whatever

Apart from the possibility of working massively collaboratively, a whole social ecosystem has sprung up around these tools. There's the obvious: Github is integrated into professional social networks such as LinkedIn and serious job websites such as Stack Overflow Careers. These integrations give hackers and scientists the opportunity to showcase themselves in completely new ways. Ask yourself: if you were an employer looking for a new bioinformatician/scientist/hacker/developer, would you rather have (a) an applicant who sends you the standard cover letter/CV combo, or (b) one whose cover letter tells you where their code can be found on Github/Bitbucket, allowing you to inspect it, and who has a Stack Overflow profile where their answers to technology questions have been reviewed by their peers and have accrued reputation and standing? I know which candidate I would be much more interested in. Clearly, using git allows you to tap into this ecosystem. There is no technical reason why this could not happen around something like SVN - practically, though, the ecosystem is simply not there.

The other aspect is social: Github has many social components - and thereby signals which can be used for metrics. This, in turn, has knock-on effects on measures of impact: new metrics systems such as ImpactStory, for example, will track not just your papers and citations, but also your open-source contributions via Github - the number of commits, followers, forks and so on. It becomes one signal in a more complete picture of the impact of a scientist/coder/engineer than traditional paper metrics alone.

The downside of all of this is that, in a way, it almost condemns you to participation. But I suspect that this is the direction knowledge work will take anyway - everything we do will become increasingly social. And, of course, it will become a significant career problem for those who don't want to participate in these systems, or can't because of, for example, institutional constraints.