As part of a recent project, I was converting a KML (Keyhole Markup Language) file to a GPX (GPS Exchange Format) file, which required extracting the longitude, latitude and elevation coordinates into a comma-delimited string. Once extracted, the string needed to be parsed into three fields. As the data could run to millions of coordinates, I needed a way of converting a delimited string into fields that would neither slow processing noticeably nor create memory issues.
Whilst it is simple to convert a text file to fields, converting a delimited string that is already in memory into separate text fields requires a different approach. Many scripting languages have built-in techniques that make it really easy to parse text into different formats. Delphi requires a slightly different approach to achieve the same.
I will share the various methods that were tried and the most effective method chosen. The code can be downloaded at the end of the blog.
Four Methods Tried
I tried a few direct methods, all with large test data, and was rather surprised and disappointed by the results.
Firstly, I streamed the string into a string array and parsed it directly into a multi-string array using the standard row-major index formula “i * ncols + j”. This is straightforward, if perhaps a little overkill; however, with anything over 3000 records, memory issues kill the process.
Secondly, I extracted the string straight into a stringlist and then parsed this directly into a multi-string array using the above process. Whilst this was an improvement, memory issues still persisted.
It seems that the two methods above may be exceeding the available physical memory or the address space available to the process.
Thirdly, I tried the TFDBatchMove component with the TFDBatchMoveTextReader, which has the option to load from a stream. Whilst loading from a file always worked, the stream option would not parse the string into fields, with or without headers being sent.
Chosen Coding Method
Parsing the string into a base TStringList and thereafter parsing the data into three other TStringLists as fields proved the most efficient of all the methods that were tried.
I will only share the important parts of the code, as the full code can be downloaded from GitHub at the end of this blog.
This method makes use of four stringlists. The first stringlist holds all the original string items split out from the delimited text. The other three stringlists receive the parsed data and act as the data of the three individual fields. This seems to be the easiest process that reduces the chances of running into memory issues with large string data, as each stringlist is able to hold up to 134,217,728 strings.
After creating the four stringlists, it is important to set StrictDelimiter to True on the temporary stringlist (sTemp) and to set its Delimiter character. Assigning the string to the DelimitedText property then loads it directly into the stringlist.
StrictDelimiter forces only the Delimiter character to be used as a separator (without it, spaces also split items), and DelimitedText receives the string. The Trim function is used to remove excess white space.
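As a minimal sketch of this loading step, the console program below splits a short comma-delimited string into a stringlist. SplitDelimited and the sample coordinates are hypothetical names and values made up for illustration; they are not taken from the downloadable code.

```delphi
program LoadDelimited;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes;

// Hypothetical helper: splits a comma-delimited string into a new TStringList.
// The caller owns the returned list and must free it.
function SplitDelimited(const Source: string): TStringList;
begin
  Result := TStringList.Create;
  try
    Result.Delimiter := ',';
    Result.StrictDelimiter := True;        // split on commas only, not on spaces
    Result.DelimitedText := Trim(Source);  // Trim drops leading/trailing white space
  except
    Result.Free;
    raise;
  end;
end;

var
  sTemp: TStringList;
begin
  // Made-up sample data: two longitude, latitude, elevation triples.
  sTemp := SplitDelimited(' 18.4233,-33.9189,12.5,18.4241,-33.9192,13.0 ');
  try
    Writeln(sTemp.Count); // 6
    Writeln(sTemp[1]);    // -33.9189
  finally
    sTemp.Free;
  end;
end.
```

Setting StrictDelimiter before assigning DelimitedText matters, because the split happens at the moment of assignment.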
To parse sTemp into the three fields, a “for .. do” loop is used to step through all of its rows.
We need to extract every third item, which is determined with the mod operator.
We use this operator to pick out every third value, i.e. the index mod three (3) must equal zero (0).
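Looping from zero (0) does not cause a division by zero: the divisor is always three (3), and “0 mod 3” is simply zero (0), so index 0 would pass the same test as every other multiple of three. The code handles index 0 with its own if statement anyway, which is harmless but not strictly required.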
Whilst it does not apply to this example, be aware that the Delphi mod operator, like the % operator in C, C++ and C#, truncates toward zero, so the result takes the sign of the dividend: -7 mod 3 gives -1, not 2. Languages such as Python return a result with the sign of the divisor instead.
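To make the behaviour of mod concrete, the short sketch below prints a few remainders, including a negative one, and shows one possible way to obtain a non-negative result. FlooredMod is a hypothetical helper written for this illustration, not an RTL function.

```delphi
program ModDemo;

{$APPTYPE CONSOLE}

// Hypothetical helper: a floored modulo whose result takes the sign of the
// divisor, unlike the built-in mod operator.
function FlooredMod(Value, Divisor: Integer): Integer;
begin
  Result := Value mod Divisor;
  // Shift the remainder when its sign differs from the divisor's sign.
  if (Result <> 0) and ((Result < 0) <> (Divisor < 0)) then
    Result := Result + Divisor;
end;

begin
  Writeln(0 mod 3);            // 0: index 0 passes the "mod 3 = 0" test
  Writeln(7 mod 3);            // 1
  Writeln((-7) mod 3);         // -1: the result takes the sign of the dividend
  Writeln(FlooredMod(-7, 3));  // 2
end.
```

Because the loop index in this post never goes below zero, the built-in mod is sufficient for the parsing itself.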
Each call to a stringlist's Add method appends the next value to the respective field.
The code for this is as follows:
    for i := 0 to sTemp.Count - 1 do
    begin
      // first field
      if i = 0 then
        sfield1.Add(sTemp[i]); // first item of the first field
      if (i > 2) and (i mod 3 = 0) then
        sfield1.Add(sTemp[i]);
      // second field
      if (i <= sTemp.Count - 2) and (i mod 3 = 0) then
        sfield2.Add(sTemp[i + 1]);
      // third field
      if (i <= sTemp.Count - 3) and (i mod 3 = 0) then
        sfield3.Add(sTemp[i + 2]);
    end;
From here on it is easy to pass the three fields directly to a database, XML file or JSON by looping over the length of any one of the three fields:
    for j := 0 to sfield1.Count - 1 do
    begin
      // Field1, Field2 and Field3 can be DB fields or, in my case,
      // longitude, latitude and elevation data.
      Field1 := sfield1[j];
      Field2 := sfield2[j];
      Field3 := sfield3[j];
    end;
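As one possible way of consuming the three fields, the sketch below writes each row out as a GPX track point. It assumes the fields hold longitude, latitude and elevation in that order; TrackPoint and the sample values are hypothetical and only illustrate the final loop.

```delphi
program FieldsToGpx;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes;

// Hypothetical helper: formats one coordinate triple as a GPX track point.
function TrackPoint(const Lon, Lat, Ele: string): string;
begin
  Result := Format('<trkpt lat="%s" lon="%s"><ele>%s</ele></trkpt>',
    [Lat, Lon, Ele]);
end;

var
  sfield1, sfield2, sfield3: TStringList;
  j: Integer;
begin
  sfield1 := nil;
  sfield2 := nil;
  sfield3 := nil;
  try
    sfield1 := TStringList.Create; // longitude
    sfield2 := TStringList.Create; // latitude
    sfield3 := TStringList.Create; // elevation

    // Made-up sample rows standing in for the parsed data.
    sfield1.Add('18.4233'); sfield2.Add('-33.9189'); sfield3.Add('12.5');
    sfield1.Add('18.4241'); sfield2.Add('-33.9192'); sfield3.Add('13.0');

    for j := 0 to sfield1.Count - 1 do
      Writeln(TrackPoint(sfield1[j], sfield2[j], sfield3[j]));
  finally
    // Free is safe to call on nil references.
    sfield3.Free;
    sfield2.Free;
    sfield1.Free;
  end;
end.
```

Keeping the formatting in a small function means the same loop could target a database or JSON writer by swapping that one call.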
Well, this sums up the best method I found for parsing a delimited text string into fields.
Feel free to download the code here
August 2019: Delphi Delimited String to Fields