OQGraph – bazaar adventures in migrating to git

Jan 27 2014

Background

I have been acting as a ‘community maintainer’ for a project called OQGraph for a bit over a year now.

OQGraph provides graph traversal algorithms over SQL data for MariaDB.  Briefly, it is very difficult, if not impossible, to write a SQL query to find the shortest path through a graph where the edges are represented, as they typically are, by two columns of vertex identifiers.  As of v3, OQGraph provides a virtual table that allows such an operation via a single SQL query.  This gives a simple mechanism for existing data to be queried in new ways.  OQGraph v3 is now merged into MariaDB 10.0.7 as of 16 December 2013.

Aside: I did a talk [1][2][3] about this subject at the Linux.conf.au OpenProgramming miniconf. I really didn’t do a very good job, especially compared with my SysAdmin [4][5][6] miniconf talk; I lost my place, “ummed” way too much, etc., although audience members kindly seemed to ignore this when I talked to them later :-) I know I was underprepared, and in hindsight I tried to cover way too much ground in the time available which resulted in a not-really coherent story arc; I should have focused on MTR for the majority of the talk. But I digress…

Correction: I also had a snafu in my slides. OQGraph v3 theoretically supports any underlying storage engine; the unit test coverage is currently only over MyISAM, but we plan to extend it to test the other major storage engines in the next while.

Launchpad and BZR

MariaDB is maintained on Launchpad. Launchpad uses bazaar (bzr) for version control.  Bazaar already has a reputation for poor performance, and my own experience using it with MariaDB backs this up.  It doesn’t help that the MariaDB history is a long one: the history converted to git shows 87000 commits and the .git directory weighs in at nearly 6 GBytes!  The MariaDB team is considering migration [7] to github, but in the meantime I needed a way to work on OQGraph using git to save my productivity, as I am only working on the project in my spare time.

Github

So here’s what I wanted to achieve:

  1. Maintain the ‘bleeding edge’ development of OQGraph on Github
  2. Bring the entire history of all OQGraph v3 development across to Github, including any MariaDB tags
  3. Maintain the code on Github without the entirety of MariaDB.
  4. Be able to push changes from github back to Launchpad+bzr

Items 1 & 3 will give me a productivity boost.  The resulting OQGraph-only repository with entire history almost fits on a 3½ inch floppy! Item 1 may help make the project more accessible in the future.  Item 2 will of course allow me to go back in time and see what happened in the past.  Item 3 also has other advantages: it may make it easier to backport OQGraph to other versions of MariaDB if it becomes useful to do so.  And item 4 is critical: for bug fixes and features to be accepted for merging into MariaDB in the short term, it is still easiest to maintain a bzr branch on launchpad.

To this end, I first created a maintenance branch on Launchpad: https://code.launchpad.net/~andymc73/maria/oqgraph-maintenance. I will regularly merge the latest MariaDB trunk into this branch, along with completed changes (tested bugfixes, etc.) from the github repository, for final testing before proposing for merging into MariaDB trunk.

Then I created a standalone git repository.  The OQGraph code is self contained in a subdirectory of MariaDB, storage/oqgraph . The result should be a git repository where the original content of storage/oqgraph is the root directory, but with the history preserved.
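A common way to get a repository of this shape is git filter-branch with its --subdirectory-filter option. The sketch below runs it on a scratch repository and is not necessarily the exact procedure used for OQGraph; the file names, tag name and commit message are made up for the demonstration.

```shell
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the warning pause in newer git

# Scratch stand-in for the big tree: one file inside storage/oqgraph, one outside.
repo=$(mktemp -d) && cd "$repo" && git init -q .
git config user.name Demo && git config user.email demo@example.com
mkdir -p storage/oqgraph sql
echo engine > storage/oqgraph/graph.cc
echo server > sql/server.cc
git add . && git commit -q -m "import tree" && git tag demo-1.0

# Rewrite history so that storage/oqgraph becomes the repository root.
# --tag-name-filter cat keeps tags pointing at the rewritten commits.
git filter-branch --subdirectory-filter storage/oqgraph \
    --tag-name-filter cat -- --all >/dev/null 2>&1

root_files=$(git ls-files)
tag_files=$(git ls-tree -r --name-only demo-1.0)
echo "files at root: $root_files"
echo "files in tag:  $tag_files"
```

On a history the size of MariaDB's, a rewrite like this has to visit every commit, so it can be expected to take a long time.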

Doing this was a bit of a journey and really tested out my git skills, and my workstation!  I will describe this in more detail in a subsequent blog entry.

I finally pushed the resulting repository up to github, where it can be found at https://github.com/andymc73/oqgraph. I also worked out the procedure for merging changes back to Launchpad; see the file Synchronising_with_Launchpad.md

 

[1] https://lca2014.linux.org.au/wiki/index.php?title=Miniconfs/Open_Programming#Developing_OQGRAPH:_a_tool_for_graph_based_traversal_of_SQL_data_in_MariaDB

[2] Video: http://mirror.linux.org.au/linux.conf.au/2014/Tuesday/139-Developing_OQGRAPH_a_tool_for_graph_based_traversal_of_SQL_data_in_MariaDB_-_Andrew_McDonnell.mp4

[3] Slides: http://andrewmcdonnell.net/slides/lca2014_oqgraph_talk.pdf

[4] http://sysadmin.miniconf.org/presentations14.html#AndrewMcDonnell

[5] Video: http://mirror.linux.org.au/linux.conf.au/2014/Monday/167-Custom_equipment_monitoring_with_OpenWRT_and_Carambola_-_Andrew_McDonnell.mp4

[6] Slides: http://andrewmcdonnell.net/slides/lca2014_sysadmin_talk.pdf

[7] https://mariadb.atlassian.net/browse/MDEV-5240


On Linux.conf.au and Presentation logistics

Jan 06 2014

How to diff LibreOffice Impress presentations in git

These days I keep track of presentation material using git (like many things). One handy trick is you can `git diff` an ODP file and have the text changes show like any other code.

The following procedure will enable this for Debian, YMMV.

  1. sudo apt-get install odt2txt
  2. Edit the file .gitattributes in the repository containing the ODP files. Note that this file is interpreted at each level in the directory structure.
  3. Add the following line to the file:
    *.odp diff=odt
  4. You should probably commit .gitattributes as well.

I haven’t yet worked out a way to make this global using ‘git config’ which is a little annoying.
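For the attribute to have any effect, git also needs to know what the named diff driver does. A minimal sketch, assuming odt2txt is on the PATH and using a scratch repository: the driver is defined with git's textconv setting, and its name (odt here) only has to match the name used in .gitattributes. The file name slides.odp is made up.

```shell
# Scratch repository so the settings can be tried without touching real work.
repo=$(mktemp -d) && cd "$repo" && git init -q .

# Route files with the .odp extension through the diff driver named "odt".
echo '*.odp diff=odt' >> .gitattributes

# Define that driver: convert each file to text with odt2txt before diffing.
git config diff.odt.textconv odt2txt

# Check what git will now do for a presentation file.
driver=$(git config diff.odt.textconv)
attr=$(git check-attr diff -- slides.odp)
echo "$driver"
echo "$attr"
```

Adding --global to the git config command stores the driver once per user. Git also reads a per-user attributes file (by default ~/.config/git/attributes, or whatever core.attributesFile points to), which may be one way to avoid repeating the setup in every repository.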

Another quick tip: to redock the ‘slides’ pane if you manage to set it “loose”, hold CTRL and double-click the word ‘slides’ at the top _inside_ the pane window.

And lastly, if anyone can tell me how to get Impress Presenter Mode to actually work with xrandr without obscuring the actual presentation that would be awesome…

Managing to avoid making last minute tweaks to a presentation is another problem not solvable with code ;-) The problem is I always think of improvements each time I re-read my talk, and also once the conference has started I always think of good techniques to try after watching the other speakers…

 

(One miniconf talk down, one to go!)


Quick Tip – AspireOne xrandr

Nov 26 2013

Remember this to get projector:

xrandr  --output VGA1 -s 1440x900 --right-of LVDS1 --auto


Quick Tip – copying colons over ssh and rsync

Nov 19 2013

If you try to copy a file in the current directory that has a colon in its name (some-file:something, say) with scp or rsync, passing the bare filename as the source, this will fail. Possibly after quite a timeout, and with a seemingly unrelated error (unresolved domain name some-file maybe?)

This is because scp and rsync use a colon to separate the user@host part from the filename part, so everything before the colon is taken to be a host name.

The fix when the source is on the local computer is to ensure the path starts with a dot, i.e. put ./ in front of the filename.

Incidentally this is related to that old newbie fail, forgetting the trailing colon when copying to a destination home directory: the copy then results in a local file in the current directory called ‘user@host’. Oops!
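As a sketch of that fix, the hypothetical helper below (local_path is not part of scp or rsync) puts ./ in front of a path whose first colon comes before any slash, which is the form these tools read as host:path. The file and host names are made up.

```shell
# Hypothetical helper: make a local path safe to hand to scp or rsync.
local_path() {
    case "$1" in
        *:*)
            prefix=${1%%:*}                   # text before the first colon
            case "$prefix" in
                */*) printf '%s\n' "$1" ;;    # a slash comes first: already local
                *)   printf './%s\n' "$1" ;;  # would be read as host:path
            esac ;;
        *) printf '%s\n' "$1" ;;              # no colon: nothing to do
    esac
}

fixed=$(local_path 'some-file:v2.txt')
echo "$fixed"

# Intended use (hypothetical host):
#   scp   "$(local_path 'some-file:v2.txt')" user@host:
#   rsync "$(local_path 'some-file:v2.txt')" user@host:
```

Note the trailing colon on the destination in both usage lines; without it the copy stays local.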


Some funky stuff with git – working with tags

Sep 24 2013

Tags form an important part of any version control system. One of the most common use cases is to mark a revision that corresponds to a released software version; this allows reconciliation of bug reports to the correct code, for example.

There are a few tag-related operations that are not immediately obvious in git, however.

Listing all commits between two tags.

This is simple enough:
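A sketch of one such command on a scratch repository, where the tag names v1 to v4 are made up: the range v1..v3 selects the commits reachable from v3 but not from v1.

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q .
git config user.name Demo && git config user.email demo@example.com

# Four commits, tagged v1 to v4 in order.
for n in 1 2 3 4; do
    git commit -q --allow-empty -m "commit $n"
    git tag "v$n"
done

# Commits after v1, up to and including v3 (subjects only, newest first).
between=$(git log --format=%s v1..v3)
echo "$between"
```

In day-to-day use, git log --oneline v1..v3 shows the abbreviated SHA alongside each subject.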

Listing all tags between two revisions.

The following command will list all tags from and including ref OLDEST_TAG_OR_BRANCH, along with the abbreviated commit SHA and brief log.  Omit the caret (^) to exclude the starting point from the list.  HEAD means go until the most recent commit in this line of history; append a caret to exclude it if it is tagged, or replace HEAD with another tag or branch point.
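One way to write such a command, as a sketch rather than necessarily the exact invocation meant above, is git log with --simplify-by-decoration over the range OLDEST_TAG_OR_BRANCH^..HEAD. The scratch repository and the tag names v2 and v4 are made up.

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q .
git config user.name Demo && git config user.email demo@example.com

# Five commits; only the second and fourth are tagged.
for n in 1 2 3 4 5; do
    git commit -q --allow-empty -m "commit $n"
    case $n in 2|4) git tag "v$n" ;; esac
done

# Walk from v2 (included, thanks to the caret) to HEAD, keeping only commits
# that a ref points at. %h is the short SHA, %d the ref names, %s the subject.
tagged=$(git log --simplify-by-decoration --format='%h%d %s' v2^..HEAD)
echo "$tagged"
```

Because --simplify-by-decoration keeps any commit that a ref points at, the current branch tip can also show up in the list even when it carries no tag.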

git log, and its cousin git rev-list, have quite a lot of formatting and selection options, so check out the man pages (i.e. man git-log and man git-rev-list ; note, individual git subcommand man pages are accessed by prepending ‘git-’ to the subcommand.)

Preserving tags during a filter-branch operation.

The command git filter-branch is a powerful tool that can be used to do things such as change the email address recorded on every commit a user made, remove specific words from commit messages, fix common spelling mistakes, rearrange the file tree, etc.  It should not be used ad hoc: it effectively rewrites the history, invalidating existing working copies, so it should be used with care.  It is often used when migrating from another system into git, or migrating servers, etc.

One trap with filter-branch is that it is easy to lose tags when using the command.  Filter-branch actually makes a new branch, leaving the original untouched in the repository as a kind of backup, called ‘refs/original/refs/heads/YOUR_BRANCH’.  Tags remain attached to this _original_ branch, and are not copied to the new branch unless --tag-name-filter is employed together with whichever other filter is being used.  Then when you use or clone the replacement YOUR_BRANCH, the tags are “gone”!  This is easily fixed:

(a) Edit every commit message in current branch, removing the word “frooble” and replacing it with “foo”.  This happens to “forget” tags on the way.

(b) Edit every commit message in current branch, removing the word “frooble” and replacing it with “foo”, this time making sure the tags move to the new branch.
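Cases (a) and (b) can be sketched with --msg-filter, adding --tag-name-filter cat for the second. This runs on scratch repositories; the file name, tag name and commit message are made up, and the sed expression is just one way to do the replacement.

```shell
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the warning pause in newer git

# Build a scratch repository holding one tagged commit.
make_repo() {
    dir=$(mktemp -d) && cd "$dir" && git init -q . &&
    git config user.name Demo && git config user.email demo@example.com &&
    echo data > file.txt && git add file.txt &&
    git commit -q -m "add frooble support" && git tag v1
}

# (a) Rewrite the messages only: v1 is left on the old, unrewritten commit.
make_repo
git filter-branch --msg-filter 'sed "s/frooble/foo/g"' HEAD >/dev/null 2>&1
head_a=$(git log -1 --format=%s HEAD)
tag_a=$(git log -1 --format=%s v1)

# (b) Same rewrite, with --tag-name-filter cat so v1 follows the new commit.
make_repo
git filter-branch --msg-filter 'sed "s/frooble/foo/g"' \
    --tag-name-filter cat HEAD >/dev/null 2>&1
tag_b=$(git log -1 --format=%s v1)

echo "(a) HEAD: $head_a / v1: $tag_a"
echo "(b) v1: $tag_b"
```

The filter given to --tag-name-filter receives each tag name on standard input and prints the name to use, so cat keeps the names unchanged while the tags are moved.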

Pushing your tags to your remote.

Remember the --tags flag, so that your tags go up to github, etc:
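A minimal sketch, with a bare scratch repository standing in for the remote and a made-up tag name v1:

```shell
# A bare repository stands in for the remote (github, etc.).
remote=$(mktemp -d) && git init -q --bare "$remote"

repo=$(mktemp -d) && cd "$repo" && git init -q .
git config user.name Demo && git config user.email demo@example.com
git commit -q --allow-empty -m "release" && git tag v1
git remote add origin "$remote"

# Push the current branch, then every local tag.
git push -q origin HEAD
git push -q origin --tags

# List the tags the remote now holds.
remote_tags=$(git ls-remote --tags origin)
echo "$remote_tags"
```

A plain git push without --tags sends the branch but leaves the tags behind, which is why the second push is needed.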

Removing tags on remote.
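A tag is removed from a remote by pushing an empty source to its ref. A minimal sketch, with a bare scratch repository standing in for the remote and made-up tag names v1 and v2:

```shell
remote=$(mktemp -d) && git init -q --bare "$remote"

repo=$(mktemp -d) && cd "$repo" && git init -q .
git config user.name Demo && git config user.email demo@example.com
git commit -q --allow-empty -m "release" && git tag v1 && git tag v2
git remote add origin "$remote"
git push -q origin --tags

# Delete v1 on the remote: nothing before the colon means "delete this ref".
git push -q origin :refs/tags/v1

# The local tag is separate and has to be removed on its own.
git tag -d v1 >/dev/null

remaining=$(git ls-remote --tags origin)
echo "$remaining"
```

git push --delete origin v1 is an alternative spelling of the same remote deletion.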

