Posts Tagged ‘API’

When is Tower Bridge opening

January 20, 2018

One of the traits that I believe makes a good product manager, beyond knowing your customer(s), is being able to recognise a real problem. For a while now I have had a “problem” that has bugged me and that I have wanted to solve for myself (I see the customers of this product as those who suffer the same problem I do).

What is my problem?

I commute by bicycle across London. I live in South East London and I work north of the river, which means I have a choice of bridges to cross, from Waterloo down to Tower Bridge. In the morning I am a creature of habit and typically cross Southwark Bridge; in the evenings I like to mix it up, and as I leave my place of work I try to decide which bridge to cross. Tower Bridge is actually one of my favourite bridges of the lot, but it carries a risk: will it be open when I arrive? This is definitely a first world problem, but if the bridge is due to lift you don’t find out until you are on the approach, at which point it is too late to turn around and take another bridge (I know, I have tried), and then you have to sit and wait for the operation to complete. While it is nice to witness, it adds at least 20 minutes to the commute (effectively doubling it) and compresses all of the traffic, making the ride more dangerous.

Fundamentally, my problem is that I need to know, as I leave the office, whether Tower Bridge is due to open that evening; if so, I will take another bridge. I always check my phone before I leave the office, so if I had a notification I would be alerted to the fact and could make an informed decision.

The research

So the first bit of research: can you get the times of the openings on the web? The simple answer is yes. Tower Bridge kindly provide this on their website: http://www.towerbridge.org.uk/lift-times/ (a quick Google search gave me this information).

Next up: do I need to scrape this data and then offer it as an API service myself, or am I lucky, and has someone already done this? Well, another Google search and I had an answer. https://tower-bridge-lifts.herokuapp.com/ provides a RESTful service from which I can get the time and date of the next lift (as well as the vessel name and direction of travel)! This was built by Andre Parmeggiani; you can see his other work on his GitHub page. I am very grateful that he did it.
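As a quick sanity check before committing to the app, a service like this can be exercised from plain Java in a handful of lines. Below is a minimal sketch; note that the endpoint path and the JSON fields in the comment are my assumptions for illustration, not the documented shape of Andre’s API.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class NextLift {
  public static void main(String[] args) throws Exception {
    // Hypothetical endpoint path; the real one is whatever the service documents.
    URL url = new URL("https://tower-bridge-lifts.herokuapp.com/api/v1/lifts/next");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestMethod("GET");
    conn.setRequestProperty("Accept", "application/json");

    try (BufferedReader in = new BufferedReader(
        new InputStreamReader(conn.getInputStream()))) {
      StringBuilder body = new StringBuilder();
      String line;
      while ((line = in.readLine()) != null) {
        body.append(line);
      }
      // Assumed response shape, for illustration only:
      // {"date":"2018-01-20","time":"17:45","vessel":"...","direction":"up river"}
      System.out.println(body);
    } finally {
      conn.disconnect();
    }
  }
}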

So that is it: everything is in place. I have the foundations to build the product (something I wanted to do for myself), and I had a week’s break over the Christmas period. Time to break open the Head First Android Development book and start learning how to code up an app.

The Product

Well, the simple solution is an MVP in the shape of a mobile app that tells me when the next opening is. My MVP consists of:

  • I can interact with the app to check the next opening of the bridge
  • I get an alert within 4 hours of the next opening of the bridge, so that when I check my phone before setting off I will see the alert and can make an informed decision on the route to take (a sketch of how this might work follows below).
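To give a flavour of that second bullet, here is a minimal sketch of how the “alert within 4 hours” scheduling could look on Android. To be clear, this is not the code from my app: the receiver class, channel id, and notification text are placeholders of my own, and it assumes API 26+ for the notification channel.

import android.app.AlarmManager;
import android.app.Notification;
import android.app.NotificationChannel;
import android.app.NotificationManager;
import android.app.PendingIntent;
import android.content.BroadcastReceiver;
import android.content.Context;
import android.content.Intent;

public class LiftAlertReceiver extends BroadcastReceiver {

  private static final String CHANNEL_ID = "tower_bridge_lifts"; // placeholder id

  // Ask AlarmManager to fire a broadcast four hours before the lift time.
  public static void scheduleAlert(Context context, long liftTimeMillis) {
    long fourHoursMillis = 4L * 60 * 60 * 1000;
    Intent intent = new Intent(context, LiftAlertReceiver.class);
    PendingIntent pending = PendingIntent.getBroadcast(
        context, 0, intent, PendingIntent.FLAG_UPDATE_CURRENT);
    AlarmManager alarms = (AlarmManager) context.getSystemService(Context.ALARM_SERVICE);
    alarms.set(AlarmManager.RTC_WAKEUP, liftTimeMillis - fourHoursMillis, pending);
  }

  // When the alarm fires, raise the "bridge opening soon" notification.
  @Override
  public void onReceive(Context context, Intent intent) {
    NotificationManager manager =
        (NotificationManager) context.getSystemService(Context.NOTIFICATION_SERVICE);
    manager.createNotificationChannel(new NotificationChannel(
        CHANNEL_ID, "Bridge lifts", NotificationManager.IMPORTANCE_DEFAULT));
    Notification notification = new Notification.Builder(context, CHANNEL_ID)
        .setSmallIcon(android.R.drawable.ic_dialog_info)
        .setContentTitle("Tower Bridge opening soon")
        .setContentText("The bridge is due to lift within the next four hours.")
        .build();
    manager.notify(1, notification);
  }
}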

[Screenshot of the app’s interface]

Now I will admit, I think I might have gone a little too heavy on the “minimum” for the interface, as I went with as basic as possible (I only had a limited amount of time). However, I think this MVP is viable: testing showed I can interact with it to get the next opening information, and I get the alerts / notifications. I have also become a bore, as I can tell you when the next opening is…. 🙂

I am pleased to announce the app is available from the Play Store: https://play.google.com/store/apps/details?id=com.itssmee.apps.towerbridge. Next up is to write the same app for the iPhone.

Please help yourself, and if you like it please rate it; I am happy to take suggestions.

And finally, thank yous

I would like to thank Andre Parmeggiani for his API.

Also, I would like to thank Nicolas de Camaret for his Flickr post, which I used as the background for the app; I really appreciate it.



Using Apache’s Sqoop as an API

October 2, 2014

I have been doing some work with Sqoop recently, using the book Apache Sqoop Cookbook, which I found very easy to follow. However, do we really want to do everything via the command line interface (CLI)? I think not. Surely we also want the option of doing it via an application programming interface (API).

I did some research and found some examples (here and here), so I thought I would document what I have done, taking some of the examples from the book mentioned and turning them into code. I hope you find this useful.

Which Sqoop am I talking about? There are two!

I am making reference to Sqoop 1, the original. The version I am using at the time of writing is 1.4.4.

Where do I get the API library from?

The library is sqoop-1.4.4.2.1.1.0-385.jar (well, that is the library I am using). It comes included in the Hortonworks VirtualBox Hadoop Sandbox 2.1 I am using, where it can be found in

 /usr/lib/sqoop/sqoop-1.4.4.2.1.1.0-385.jar 

What can I do?

You can do everything you can do at the CLI, but from within your code, rather than having to fork to the shell to sync data into HDFS.

How does it work?

I thought I would run through some of the examples in the book, but before doing that you should look at “The Code” section below, as I have encapsulated the methods to try to make things easier. The book covers syncing a complete table, specifying the HDFS location you want the data in, compressing the data, placing a Hive table over the top, and the list goes on.

I will provide the API examples based on my code, as well as the CLI commands, so you can get a good understanding. I am basing all of my examples on a simple MySQL database and table(s) I have created; these are really simple.

Syncing an entire table:

CLI:


sqoop import --connect jdbc:mysql://localhost/sqoop \
--username TheUsername \
--password ThePassword \
--table cities

API:


public static void main(String[] args) {

  setUp();

  TransferringEntireTable("cities_api");

  runIt();

}

 

Syncing an entire table, specifying the directory:

CLI:

sqoop import --connect jdbc:mysql://localhost/sqoop \
--username sqoop \
--password password \
--table cities \
--warehouse-dir /etl/input

API:

public static void main(String[] args) {

  setUp();

  TransferringEntireTableSpecificDir("cities_api","/etl/input");

  runIt();
}

Syncing an entire table, specifying the directory and compressing:

CLI:

sqoop import --connect jdbc:mysql://localhost/sqoop \
--username sqoop \
--password password \
--table cities \
--compress \
--compression-codec org.apache.hadoop.io.compress.BZip2Codec \
--warehouse-dir /etl/input/compressed

API:

public static void main(String[] args) {

  setUp();

  CompressingImportedData("cities_api","/etl/input/compressed","org.apache.hadoop.io.compress.BZip2Codec");

  runIt();
}

An incremental import of a table:

CLI:

sqoop import --connect jdbc:mysql://localhost/sqoop \
--username sqoop \
--password password \
--table visits \
--incremental lastmodified \
--check-column last_update_date \
--last-value "2014-08-14 00:00:00"

API:

public static void main(String[] args) {

  setUp();

  incrementalImport("booking_api","/etl/input",IncrementalMode.DateLastModified,"last_modified","2014-05-14 00:00:00");

  runIt();
}

Syncing an entire table and putting a Hive table over the top:

CLI:

sqoop import --connect jdbc:mysql://localhost/sqoop_test \
--username sqoop \
--password password \
--table cities \
--hive-import \
--warehouse-dir /etl/input

API:

public static void main(String[] args) {

  setUp();

  TransferringEntireTableSpecificDirHive("cities_api","/etl/input");

  runIt();
}

Syncing a partitioned table and updating/creating the Hive table:

CLI:

sqoop import --connect jdbc:mysql://localhost/sqoop \
--username sqoop \
--password password \
--table visits_by_day \
--hive-import \
--hive-partition-key visit_day_partition \
--hive-partition-value "2014-10-01"

API:

public static void main(String[] args) {

  setUp();

  TransferringEntireTableSpecificDirHivePartition("visits_by_day","/etl/input","visit_day_partition", "2014-10-01");

  runIt();
}

Below is the code; you can see how I have encapsulated the methods to try to make the implementation as “simple” as possible.

The Code:


package sqoopexamples;

import com.cloudera.sqoop.SqoopOptions;
import com.cloudera.sqoop.SqoopOptions.IncrementalMode;
import com.cloudera.sqoop.tool.ImportTool;

public class SqoopExamples {

  private static final SqoopOptions options = new SqoopOptions();
  private static final String connectionString = "jdbc:mysql://127.0.0.1/sqoop_test";
  private static final String username = "MySQL database username";
  private static final String password = "MySQL database password";

  // Set the connection details shared by every example.
  private static void setUp() {
    options.setConnectString(connectionString);
    options.setUsername(username);
    options.setPassword(password);
  }

  // Run the import with whatever options have been accumulated,
  // failing loudly on a non-zero return code.
  private static int runIt() {
    int res = new ImportTool().run(options);
    if (res != 0) {
      throw new RuntimeException("Sqoop API Failed - return code : " + res);
    }
    return res;
  }

  // Equivalent of: --table <table>
  private static void TransferringEntireTable(String table) {
    options.setTableName(table);
  }

  // Adds: --warehouse-dir <directory>
  private static void TransferringEntireTableSpecificDir(String table,
      String directory) {
    TransferringEntireTable(table);
    options.setWarehouseDir(directory);
  }

  // To do: --where <clause> support
  private static void TransferringEntireTableWhereClause(String table,
      String whereClause) {
    // To do
  }

  // Adds: --compression-codec <codec>
  private static void CompressingImportedData(String table, String directory,
      String codec) {
    TransferringEntireTableSpecificDir(table, directory);
    options.setCompressionCodec(codec);
  }

  // Adds: --incremental <mode> --check-column <column> --last-value <value>
  private static void incrementalImport(String table, String directory,
      IncrementalMode mode, String checkColumn, String lastValue) {
    TransferringEntireTableSpecificDir(table, directory);
    options.setIncrementalMode(mode);
    options.setAppendMode(true);
    options.setIncrementalTestColumn(checkColumn);
    options.setIncrementalLastValue(lastValue);
  }

  // Adds: --hive-import
  private static void TransferringEntireTableSpecificDirHive(String table,
      String directory) {
    TransferringEntireTableSpecificDir(table, directory);
    options.setHiveImport(true);
  }

  // Adds: --hive-partition-key <key> --hive-partition-value <value>
  private static void TransferringEntireTableSpecificDirHivePartition(String table,
      String directory, String partitionKey, String partitionValue) {
    TransferringEntireTableSpecificDirHive(table, directory);
    options.setHivePartitionKey(partitionKey);
    options.setHivePartitionValue(partitionValue);
  }
}

And Finally

There is another way to use the Sqoop API. It is not one that excites me, but I thought I would share it:

public static int performHdfsImport() {
  SqoopTool tool = new ImportTool();
  String[] args = {
      "--connection-manager", "driver"
      , "--connect", "jdbc:db2://dle-db2edw01.dl.karmalab.net:50001/EXPDEV01"
      , "--username", "username"
      , "--password", "password"
      , "--table", "table_name"
      , "--target-dir", "/directory/name"
  };

  SqoopOptions options = new SqoopOptions();
  try {
    // Parse the CLI-style arguments into a SqoopOptions instance.
    options = tool.parseArguments(args, null, options, false);
    tool.validateOptions(options);
  } catch (Exception e) {
    System.err.println(e.getMessage());
    throw new RuntimeException(e);
  }
  return tool.run(options);
}

The reason I am not that keen on this is that it doesn’t really differ from running it via the CLI. I like having the control of setting the values via the methods; with this approach you would have plenty of final strings as the keys, then assign the corresponding values, then populate the string array to pass into the method. Not my cup of tea.

A Treemap to find a rental bike in London using Processing

August 30, 2010

I was recently searching the web for news stories about the new rental bikes in London. In the process I stumbled across an API, produced by Adrian Short.

Upon seeing this data I had an epiphany! I have come across numerous APIs in my travels of the web, but this time I realised I could actually do something with the data!

Boris Bikes

One of the comments journalists, along with a few friends, have made is that it is hard to locate a bay of bikes, to know whether any bikes are available, or to know whether there is room to park your bike when you get there. This is where the treemap comes into its own.

Some background on treemaps (for those who don’t know): “Ben Shneiderman, of the Human-Computer Interaction Laboratory (HCIL) at the University of Maryland, worked out a set of algorithms to subdivide 2D space to show the relative size of files and directories (see http://www.cs.umd.edu/hcil/treemap-history). He called the representation a treemap, referencing the two-dimensional mapping of tree structures.” [quote taken from Ben Fry’s book]

I had the idea from Ben Fry’s book Visualizing Data, in which he uses Ben Shneiderman’s Treemap software (which I have also utilised): http://www.cs.umd.edu/hcil/treemap-history.

Before visualising the data as is, I decided I wanted to create a top layer, that is, the location (i.e. North, East, South, West) of London you want to investigate; when you drill in you can then view all of the regions in that location, and then drill in further to see the specifics. For each level I decided to have an info box follow the mouse pointer to give the full details of what you are looking at. Remember, the size of each rectangle represents how full the dock is: if it is full of bikes the rectangle is large, if it is empty the rectangle is small.
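For the curious, the heart of a treemap is just proportional subdivision of a rectangle. Here is a minimal sketch of Shneiderman’s slice-and-dice approach, sizing cells by bike count; this is my own illustration, not the HCIL Treemap software or my actual Processing sketch, and the dock numbers are made up.

public class TreemapSketch {

  static class Rect {
    final double x, y, w, h;
    Rect(double x, double y, double w, double h) {
      this.x = x; this.y = y; this.w = w; this.h = h;
    }
  }

  // Split 'bounds' into strips whose areas are proportional to 'values'
  // (e.g. bikes docked per station); alternate 'horizontal' per level
  // to get the slice-and-dice layout.
  static Rect[] sliceAndDice(Rect bounds, double[] values, boolean horizontal) {
    double total = 0;
    for (double v : values) total += v;
    Rect[] cells = new Rect[values.length];
    double offset = 0;
    for (int i = 0; i < values.length; i++) {
      double share = values[i] / total;
      if (horizontal) {
        cells[i] = new Rect(bounds.x + offset, bounds.y, bounds.w * share, bounds.h);
        offset += bounds.w * share;
      } else {
        cells[i] = new Rect(bounds.x, bounds.y + offset, bounds.w, bounds.h * share);
        offset += bounds.h * share;
      }
    }
    return cells;
  }

  public static void main(String[] args) {
    // Four made-up docks holding 12, 3, 7 and 20 bikes share a 400x300 canvas.
    Rect[] cells = sliceAndDice(new Rect(0, 0, 400, 300),
        new double[] {12, 3, 7, 20}, true);
    for (Rect r : cells) {
      System.out.printf("x=%.1f width=%.1f%n", r.x, r.w);
    }
  }
}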

To drill into the data, left-click on an area, which will take you in a level. To return a level, or more, right-click.

I can’t thank Adrian Short enough for this, Boris Bikes for hosting the link to the API, and the Processing project.

I really like how it looks (no surprise there!).

As ever, this is a working prototype and I would be interested to hear any feedback. Of course, because this is a Processing application it is a Java applet, and because the applet accesses the API it needs to be signed; you will have to allow access for the applet to work. Have a play (http://bikefacts.co.uk/Processing/Boris/inde.html).