Sunday, October 2, 2016

Importing GPX files to Oracle database, part 1

Update 2016-10-16: Found and error in procedure, see below.

In preparation for a presentation on Oracle Spatial for DBAs I decided to import some GPS-data. I have an app called GPSLogger on my Android mobile that logs GPS data to a file. The format for this is GPS Exchange Format (GPX).  Since I spent some time understanding the XML side of this I decided to blog about it.

Note, there are two versions of the GPX schema, 1.0 and 1.1. My app uses the version 1.0, if you try to import a GPX file, have a look in it to verify which schema definition it uses (Look for xsi:schemaLocation in the head of the GPX file).

I followed the information from this blog post (that actually uses version 1.1) together with info from Oracle Support, and lots of googling.

The app can upload to Dropbox which makes it easy to transfer the file to the database server. In addition I needed the schema definition, which you can get from Topographix.com, you'll find the link to the XSD file in the GPX file as mentioned, in my case it was http://www.topografix.com/GPX/1/0/gpx.xsd

However, the file needs some adjustment in order to have Oracle parse values of type xsd:dateTime correctly. Open it in an editor and look for the following block:

<xsd:schema
        xmlns:xsd="http://www.w3.org/2001/XMLSchema"
        xmlns:gpx="http://www.topografix.com/GPX/1/0"
        targetNamespace="http://www.topografix.com/GPX/1/0"
        elementFormDefault="qualified">

At the end of this,  add one line so it looks like this:


<xsd:schema
        xmlns:xsd="http://www.w3.org/2001/XMLSchema"
        xmlns:gpx="http://www.topografix.com/GPX/1/0"
        targetNamespace="http://www.topografix.com/GPX/1/0"
        elementFormDefault="qualified"
        xmlns:xdb="http://xmlns.oracle.com/xdb>"

Then you have to search through the file and search for every definition that uses type "xsd:dateTime" and add:

 xdb:sqltype="TIMESTAMP WITH TIME ZONE"

so it looks like this:

xsd:element minoccurs="0" name="time" type="xsd:dateTime" 
xdb:sqltype="TIMESTAMP WITH TIME ZONE"

(Take away the line break, I added it to make it readable).
There are four entries, so this manual job does not take a long time. I transferred this file together with my GPX file to a directory on my server (/u01/app/oracle/load_dir). Here is the first part to get started, the first part is executed as SYSTEM:


create user oyvind identified by oracle 
default tablespace users temporary tablespace temp 
quota unlimited on users;

grant create session, create table, create view, create procedure to oyvind;
grant alter session to oyvind;
grant create type to oyvind;

create directory LOAD_DIR as '/u01/app/oracle/load_dir';
grant read, write on directory load_dir to oyvind;

As you can see, I did not granted XDBADMIN to this user.  Update 2016-10-16: The use of DBMS_XDB does indeed require being SYS, the owner of the package (XDB), or having one of the roles XDBADMIN / DBA. Since I did not want an ordinary user to have an admin role, I executed the first part that uses the DBMS_XDB package as SYSTEM. The function below was created in the schema of the ordinary user as well. I must have been experimenting with granting and revoking the XDBADMIN without noticing that the first part did nothing since the resource was already there.

The first step is to load the XML schema definition to the database. I copied the function getClobDocument from the mentioned blog, it is used to load files as CLOB:


create or replace function getClobDocument(
    p_directoryname In varchar2,
    p_filename      In varchar2,
    p_charset       In varchar2 default NULL)
    return          CLOB deterministic
is
  v_file          bfile := bfilename(p_directoryname,p_filename);
  v_charContent   CLOB := ' ';
  v_targetFile    bfile;
  v_lang_ctx      number := DBMS_LOB.default_lang_ctx;
  v_charset_id    number := 0;
  v_src_offset    number := 1 ;
  v_dst_offset    number := 1 ;
  v_warning       number;
begin
  if p_charset is not null then
    v_charset_id := NLS_CHARSET_ID(p_charset);
  end if;
  v_targetFile := v_file;
  DBMS_LOB.fileopen(v_targetFile, DBMS_LOB.file_readonly);
  DBMS_LOB.LOADCLOBFROMFILE(v_charContent, 
                            v_targetFile,
                            DBMS_LOB.getLength(v_targetFile), 
                            v_src_offset, 
                            v_dst_offset,
                            v_charset_id, 
                            v_lang_ctx,
                            v_warning);
  DBMS_LOB.fileclose(v_targetFile);
  return v_charContent;
end;
/


The following code worked for me to create resource:


declare
  schemaURL varchar2(256) := 'http://www.topografix.com/GPX/1/0/gpx.xsd';
  schemaDoc varchar2(30) := 'gpx.xsd';
  xmlSchema xmlType;
  res       boolean;
begin
  xmlSchema := XMLTYPE(getCLOBDocument('LOAD_DIR',schemaDoc,'UTF8'));
  if (dbms_xdb.existsResource('/home/' ||schemaDoc)) then
      dbms_xdb.deleteResource('/home/' ||schemaDoc);
  end if;
  res := dbms_xdb.createResource('/home/' || schemaDoc,xmlSchema);
  if res then
    dbms_output.put_Line('OK');
  else
    dbms_output.put_Line('NOK');
  end if;
end;
/

Make sure you turn on server output and get a proper message to indicate if it succeeded or not; no error message was returned when it failed.

The rest of this is executed as user oyvind (Update: also create the getClobDocument from above to this user). Next step is to register the schema, I used the following code:

declare
  schemaURL varchar2(256) := 'http://www.topografix.com/GPX/1/0/gpx.xsd';
  schemaDoc varchar2(30) := 'gpx.xsd';
begin
  dbms_xmlschema.registerSchema
  (
    schemaURL,
    XMLTYPE(getCLOBDocument('LOAD_DIR',schemaDoc,'AL32UTF8')),
    local => true,
    genTypes => true,
    genTables => false,
    enableHierarchy => dbms_xmlschema.ENABLE_HIERARCHY_NONE
  );
end;
/

Use the following code to create a table to store the document as XMLType:


CREATE TABLE gpx
    OF XMLType (CHECK (XMLIsValid(object_value) = 1)) 
    XMLSCHEMA "http://www.topografix.com/GPX/1/0/gpx.xsd" ELEMENT "gpx";


Then, finally, you can load the GPX file with this:


INSERT INTO GPX 
VALUES (XMLTYPE(getCLOBDocument('LOAD_DIR','20161001.gpx')));
commit; 

To see some of the data, you can extract what you find interesting with:


 SELECT 
       to_number(EXTRACTVALUE(VALUE(t), 'trkpt/@lon'))  longitude,
       to_number(EXTRACTVALUE(VALUE(t), 'trkpt/@lat'))  latitude,
       TO_NUMBER(EXTRACTVALUE(VALUE(t), 'trkpt/ele'))   Elevation,
       EXTRACTVALUE(value(t),'trkpt/src') src
 FROM GPX g,
      TABLE(XMLSequence(extract(g.OBJECT_VALUE,'/gpx/trk/trkseg/trkpt','xmlns="http://www.topografix.com/GPX/1/0"'))) t;


This is an example of what comes out (it was a short walk):


LONGITUDE LATITUDE ELEVATION SRC
10.79831903 59.83451229 85 gps
10.80009538 59.83263633 170 gps
10.80033655 59.83156553 146 gps
10.8003321 59.8303762 network

In next post I plan to use the Spatial datatype to store these points. I may do another test with GPX version 1.1 in the mean time since that is the version used by my Garmin GPS.

Some tips if you need to clean up in case of errors;  if you have the recyclebin enabled, do a "purge recyclebin" after dropping the table with the XMLType. I used to following command to delete the schema:

purge recyclebin;
exec dbms_xmlschema.deleteschema(schemaurl => 'http://www.topografix.com/GPX/1/0/gpx.xsd');


Get in touch if you need a copy of the final gpx.xsd file.