
Processing a file of lines

Or lines of data from another source
A general overview

(Page's URL: 3linesfs-simp.htm)

This follows on from a more general discussion of processing a number of LineRTs from a SourceRT.

In this page, I am going to work up the code for running through a data source holding...

Alaska, 660,000
Texas, 270,000
California, 160,000
Montana, 140,000
New Mexico, 120,000

(Those numbers are approximate- areas in square miles) ... and getting 1,350,000.

(Puzzled by "LineRTs" and "SourceRT"? The "RT" is to say I am, for this group of tutorials, using "line" and "source" in very narrow senses of those words. I have made them Reserved Terms here. The "RT" issue has a minor page of its own.)

Fasten seat belt

In the more general discussions, I pointed out a problem... but was vague about the solution.

I hope you will have read the more general discussion before starting this. If you have, I make no apologies for the fact that I've carried bits of text forward from there. Stay awake! I've also, here, extended those passages!

With some SourceRTs, you can know how many LineRTs there are from the outset. They are quite easy to process without getting all tangled up in code that may not work in every instance. Quite easy. It's still possible to write the code badly.

But with other SourceRTs, you don't know how many lines there are. If you read the first, the second, the next, the next, the next... eventually the subroutine that you use will return a message that says "The last time you tried to read a LineRT, you read the last LineRT in the SourceRT."

We're going to look at a way to overcome this problem in the course of this. It will be done in Lazarus, but without any "special-to-Lazarus" elements. This essay should help you regardless of what language you are using.

Open SourceRT
Repeat
  FetchALine
  Process what's in it
  Until all lines done
Close SourceRT

("FetchALine", by the way, is something we will write. It is not built into Lazarus.)

The pseudo code above is "nice and simple". A bit too simple, alas.

What happens if you're accessing a SourceRT where you can't know you've read the last LineRT at the moment you read it? (There are SourceRTs where you only learn that when you issue the next FetchALine. THEN you learn that the PREVIOUS "FetchALine" fetched the last LineRT of the SourceRT. Sigh.)

The answer!

The answer depends on a small bit of "cleverness", and a carefully constructed "FetchALine" subroutine.

At the start of the whole exercise, we will "pre-fetch" a line before we enter the main "process stuff" loop.

We will use the data in the "pre-fetched" line on the first pass through the loop.

Then, near the end of the loop, we will try to fetch another line. If successful, we'll go back, do the loop. If not, we can just "drop out" of the loop, pass on to the next part of the application.

All that "pre-fetch" stuff is the "bit of cleverness".

The "carefully constructed" FetchALine subroutine not only fetches a line (if possible) but it ALSO returns an "error code". (The code may stand for "no error detected", in which case "error code" is not a very good name, but that's what these things are always called.)

For our needs here, we need FetchALine to return the following in the various situations given...

The Code

Hurrah! "Foundations" laid. I hope the almost-Lazarus that follows makes sense now? It should! Re-read what's above, until it does.

In 3linesfs-main.htm, I used a variable called boMoreToDo. Here, iErrReturnedFrmFetch takes care of the things boMoreToDo did... and does other jobs, too!

The "error" codes FetchALine could return might be...

(For a given "environment", you might always get the type "1" error, and never need the type "2" error. But the code we'll have by the end of this will be general... able to cope with that environment, or one where you never get a type "1" error, and have to depend on the type "2" error.)
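As a minimal sketch of what such codes could look like in Free Pascal: the numbering below is an assumption, read off the loop outline further down the page (0 = a line was fetched and more may follow, 1 = a line was fetched and it was the last one, 2 = nothing was fetched because the last line had already been read). The constant names, the in-memory array standing in for a SourceRT, and the boSourceWarns switch are all made up for this illustration.

```pascal
program FetchCodesSketch;
{$mode objfpc}{$H+}

const
  // ASSUMED meanings; the names are hypothetical, not from the tutorial's code.
  FAL_OK           = 0; // a line was fetched; more may follow
  FAL_WAS_LAST     = 1; // a line was fetched, and it was the last one
  FAL_NOTHING_LEFT = 2; // nothing fetched: the last line had already been read

  // A tiny in-memory stand-in for a SourceRT.
  Lines: array[0..2] of string =
    ('Alaska, 660,000', 'Texas, 270,000', 'California, 160,000');

var
  iNext: integer = 0;      // index of the next line to hand out
  boSourceWarns: boolean;  // true: this "environment" can report code 1

procedure FetchALine(var sLine: string; var iErr: integer);
begin
  sLine := '';
  if iNext > High(Lines) then
    iErr := FAL_NOTHING_LEFT
  else
  begin
    sLine := Lines[iNext];
    Inc(iNext);
    if boSourceWarns and (iNext > High(Lines)) then
      iErr := FAL_WAS_LAST
    else
      iErr := FAL_OK;
  end;
end;

var
  s: string;
  iErr, i: integer;
begin
  boSourceWarns := true;
  // Four fetches against three lines, to see every code go by.
  for i := 1 to 4 do
  begin
    FetchALine(s, iErr);
    WriteLn(iErr, ' "', s, '"');
  end;
end.
```

With boSourceWarns set to true, the four fetches report codes 0, 0, 1, 2. Set it to false and the same source behaves like the awkward kind: 0, 0, 0, 2.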

It might have been easier to "split" this differently.

With the system presented here, the body of code for processing LineRTs from a SourceRT is the same, whether the subroutines written by others for accessing what's in the SourceRT can tell you "you just read the last LineRT" or not. (The alternative system is that you learn there are no more LineRTs the NEXT time you attempt to read one, whereupon you are told... subtly different... "You read the last LineRT on a prior attempt to read a line.")

As I say, you could write two versions of the body of code to process LineRTs. Maybe you should! But I am pressing on with one that can accommodate either system.

The FetchALine subroutine that you write will have to be "right" for the way the SourceRT involved serves up lines.

If the SourceRT "tells you" that you've just read the last LineRT, then it quite possibly won't have a different error code for "You read the last line already". (Open question: does my system NEED you to provide detection of a "read too many times"? Even if it does not, make FetchALine flag an error somehow, in case I am wrong about that trigger not being needed. See also the manuscript's "Can come out?" note in the margin of the last flowchart.)

So... in outline...

Open SourceRT;
sResultFrmFetch:=FetchALine;//This is the do-it-once
   //"pre-fetch"
//FetchALine also fills iErrReturnedFrmFetch
//If you wish, you can do this more elegantly, by
//  making a record to hold both sResultFrmFetch and
//  iErrReturnedFrmFetch... but I wanted to make this
//  accessible to people who wouldn't know how to
//  avail themselves of the serious benefits that
//  would entail.
boNotDoneYet:=false;//Presume "done" for a moment.
boDoOnceMore:=false;//Presume for a moment.
if (iErrReturnedFrmFetch=0)
  or (iErrReturnedFrmFetch=2) then begin
        boNotDoneYet:=true;
        boDoOnceMore:=true;
        end;
While boNotDoneYet do begin
  Process sResultFrmFetch
  Do another FetchALine (Which refills sResultFrmFetch
     and iErrReturnedFrmFetch with new values)
  if iErrReturnedFrmFetch=2 then boNotDoneYet:=false;
  if iErrReturnedFrmFetch=1 then
    begin
      boDoOnceMore:=false;//Won't halt loop... yet!
      end;
  end;//... of the Begin that the While... do... started.
Close SourceRT
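Here is one way the outline could be turned into something that compiles and runs as a console program. It is a sketch under assumptions, not the finished application: it assumes code 0 means "line fetched, more may follow", 1 means "line fetched, and it was the last", 2 means "nothing fetched". It also folds boNotDoneYet and boDoOnceMore into a single flag. The array Src and the counter iLinesSeen are stand-ins invented for the demonstration.

```pascal
program PrefetchLoopSketch;
{$mode objfpc}{$H+}

const
  // In-memory stand-in for the SourceRT.
  Src: array[0..4] of string =
    ('Alaska, 660,000', 'Texas, 270,000', 'California, 160,000',
     'Montana, 140,000', 'New Mexico, 120,000');

var
  iNext: integer = 0;
  iLinesSeen: integer = 0;

// A source of the awkward kind: it never warns that a line is the last.
// You only find out on the fetch AFTER the last line (assumed code 2).
procedure FetchALine(var sLine: string; var iErr: integer);
begin
  if iNext <= High(Src) then
  begin
    sLine := Src[iNext];
    Inc(iNext);
    iErr := 0;
  end
  else
  begin
    sLine := '';
    iErr := 2;
  end;
end;

procedure ProcessLine(const sLine: string);
begin
  Inc(iLinesSeen);
  WriteLn(sLine);
end;

var
  sResultFrmFetch: string;
  iErrReturnedFrmFetch: integer;
  boNotDoneYet: boolean;
begin
  // The do-it-once "pre-fetch".
  FetchALine(sResultFrmFetch, iErrReturnedFrmFetch);
  // Codes 0 and 1 both mean "there is a line in hand to process".
  boNotDoneYet := (iErrReturnedFrmFetch = 0) or (iErrReturnedFrmFetch = 1);
  while boNotDoneYet do
  begin
    ProcessLine(sResultFrmFetch);
    if iErrReturnedFrmFetch = 1 then
      boNotDoneYet := false // the line just processed was flagged as the last
    else
    begin
      FetchALine(sResultFrmFetch, iErrReturnedFrmFetch);
      if iErrReturnedFrmFetch = 2 then
        boNotDoneYet := false; // nothing came back: we are finished
    end;
  end;
  WriteLn('Lines processed: ', iLinesSeen);
end.
```

The same loop copes with a FetchALine that reports code 1 on the final line. In that case the loop stops without attempting a fetch beyond the end of the source.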

I put a skeleton of that together. You don't "need" to load it, but it might interest you to see my second stage, as I go about building an application. The first stage is the planning. What will it do? How will it do it? Time not spent on that phase is always expensive. But planning isn't as much fun as writing! Anyway, once I'd done as much planning as I could force upon myself, I put the skeleton together.

unit ldn155linesfrmsrcU1;

//EARLY STAGE: TO ILLUSTRATE THAT THINGS CAN BE BUILT UP
//  FROM "SHELLS"

{$mode objfpc}{$H+}

interface

uses
  Classes, SysUtils, Forms, Controls, Graphics, Dialogs, StdCtrls;

const vers='11 Sept 21';

(*Started 11 Sept 21, for wywtk.com\lut\3linesfrmsrc\3linesfs-simp.htm
*)

type

  { Tldn155linesfrmsrcF1 }

  Tldn155linesfrmsrcF1 = class(TForm)
    buQuit: TButton;
    buDoIt: TButton;
    laTxtResult: TLabel;
    laTotal: TLabel;
    meSourceOfLines: TMemo;
    procedure buDoItClick(Sender: TObject);
    procedure buQuitClick(Sender: TObject);
    procedure FormCreate(Sender: TObject);
  private
    iAccum:integer;

    procedure InitializeAccumulator;
    function bOpenSrcOfLines(sSourceL:string):byte;
    procedure FetchALine(var sLineFrmSourceL:string;var iErrFAL_L:integer);
    procedure CloseSource;

  public

  end;

var
  ldn155linesfrmsrcF1: Tldn155linesfrmsrcF1;

implementation

{$R *.lfm}

{ Tldn155linesfrmsrcF1 }

procedure Tldn155linesfrmsrcF1.FormCreate(Sender: TObject);
begin
    caption:='LDN155-Lines from Source, vers: '+vers;
    application.title:='LDN155-LinesFrmSrc';
end;

procedure Tldn155linesfrmsrcF1.buQuitClick(Sender: TObject);
begin
  application.terminate;
end;

procedure Tldn155linesfrmsrcF1.buDoItClick(Sender: TObject);
var bErrOpenSrcOfLines:byte;
    sSource,sLineFrmSource:string;
    boNotDone:boolean;
    iErrFAL:integer;

begin
  InitializeAccumulator;
  boNotDone:=false;//Stays false unless the source opens without error.

  sSource:='../StatesWithArea.txt';
  bErrOpenSrcOfLines:=bOpenSrcOfLines(sSource);

  case bErrOpenSrcOfLines of
    0: begin
        boNotDone:=true;
        FetchALine(sLineFrmSource,iErrFAL);
        laTotal.caption:='Source opened without error';//This may be over-
          //written almost immediately in finished application, but may
          //be useful during prgm development.
        end;//case 0
    1: laTotal.caption:='(explain err 1)';
    2: laTotal.caption:='(explain err 2)';//Yes... ; before an else- the else of a CASE...
  else
    laTotal.caption:='Unexpected error occurred during OpenSource';
    end;//case


  iAccum:=0;  //for debug

  //A very modest start to something that will grow considerably!....
  while boNotDone do begin
   inc(iAccum);
   if iAccum>5 then boNotDone:=false;// NOT not done... i.e., we are.
   end;//Of "while"
  //End of "a  very modest start..."

  laTotal.caption:=inttostr(iAccum);
  CloseSource;
end;//of buDoItClick

procedure Tldn155linesfrmsrcF1.FetchALine(var sLineFrmSourceL:string;var iErrFAL_L:integer);
begin
end;//of FetchALine

procedure Tldn155linesfrmsrcF1.InitializeAccumulator;
begin
  iAccum:=0;
end;//of InitializeAccumulator

procedure Tldn155linesfrmsrcF1.CloseSource;
begin
  //xx
end;//of CloseSource

function Tldn155linesfrmsrcF1.bOpenSrcOfLines(sSourceL:string):byte;
begin
  result:=0;
end;//of OpenSrcOfLines

end.

After that was in place, it was "just" a matter of "filling in" the details!

A tiny detail...

Soon after the above was done, I added a second memo, meOutput.

The "cleverness" that you see when you run the app lets you "slide" the boundary between meSource on the left and meOutput on the right, and have everything resize nicely. (The .exe is in the demo code you can download with the link further down the page. It MAY NOT BE THERE YET! This is being written 12 Sep 21.) That behaviour is down to a TSplitter element on the form, plus the magic of the Anchor Editor. All of that is the subject of another tutorial someday! For now: "The secret", once you master the Anchor Editor, is that you start by using the Object Inspector to change the TSplitter's "Align" property to alNone. For some reason that is not clear to me, you can't turn it off simply by un-ticking the "enabled" boxes using the Anchor Editor. The TSplitter object is nothing very special. The magic happens with the settings you make with the Anchor Editor for it, and the two memos. Think of the TSplitter object as a passive "bar" that you can move around on the form.

An eensy-teensy annoyance: I somehow got things muddled, and found that if I'd used the Anchor Editor to set the position of something, i.e. the left, top, width or height of a component, I couldn't thereafter change those values with the Object Inspector, no matter how I "played" with what was enabled, or what things were "connected" (or "not") to. Not a big problem... just set them by anchoring the "thing" to the form, and choose the "rule" (Anchor right-to-left/ center/ anchor right-to-right, etc) to What Works. (The "rule" is set with the buttons on the Anchor Editor.)

The Anchor Editor IS a wondrous thing, and very useful... but like any powerful tool, there are some learning curve foothills to conquer! (See the wiki.lazarus page if you want to work with a TSplitter.)

Purpose of meOutput

I put meOutput on the form to give myself a place to put things to let me see how the processing of the LineRTs from the SourceRT was going, as the program executes.

Of course, the whole of the processing that happens when I click the DoIt button will normally be nearly instantaneous, but by sending things to meOutput, I can create a history of the event.

As a start, I added lines like...

meOutput.lines.add('bOpenSrcOfLines function completed');

... to the developing app. That line was, of course, inserted at the end of the bOpenSrcOfLines function. Add others as you build out the application.

So far, so good

So far, I had an app that would open a SourceRT so that, eventually, I could fetch lines from it, and process them.

But, as yet, there was no "Fetch" or "Process"! I only had a tiny "thing" inside DoIt...

  //A very modest start to something that will grow considerably!....
  while boNotDone do begin
   inc(iAccum);
   if iAccum>5 then boNotDone:=false;// NOT not done... i.e., we are.
   end;//Of "while"
  laTotal.caption:=inttostr(iAccum);
  //End of "a  very modest start..."  

... but it was in the right place for the Fetch/Process stuff... and other elements of the whole task were in place.

Next...

The next thing to do is to set up the loop that will go through the SourceRT.

For now "ProcessLine" will merely pass the LineRTs to meOutput (so we can see that they were fetched!)

Initially, after I'd created a ProcessLine as...

procedure Tldn155linesfrmsrcF1.ProcessLine(sLineL:string);
begin
  meOutput.lines.add(sLineL);
end;//of ProcessLine

... I could see THAT was working by adding just one line in my early "Fetch/Process" "stand in"...

  while boNotDone do begin
   inc(iAccum);
   ProcessLine(inttostr(iAccum));  // <<< THIS NEW <<<<<<
   if iAccum>5 then boNotDone:=false;// NOT not done... i.e., we are.
   end;//Of "while"

See how you can build things, bit by bit, testing as you go along? Important art to master!

Yes, but what does it DO?????

Don't you hate it when people ask that?

But if you've lost sight of what all of this is in aid of, perhaps you can be forgiven.

All of this is "just" to show you some code that will work through a SourceRT, fetching each LineRT from it in turn.

Parts of that will have to be tweaked for the exact SourceRT involved in a given situation... but you should find that you have many apps that can tap essentially equivalent SourceRTs. I frequently use a text file as my source, with each "line" of the text file (in text file terms of what a "line" is) being the "line" in the LineRT sense that this application is designed to process.

For a different TYPE of source, you may need to tweak the following... but the rest of the code will remain the same!...

Code that sometimes has to be altered for SourceRT type...

And they won't need much tweaking. The hard part, the choosing of what the inputs and outputs should be, has already been established.
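For a plain text file as the SourceRT, the source-specific routines might look something like the sketch below. It is an illustration, not the code in the download: it assumes the routine names from the skeleton (bOpenSrcOfLines, FetchALine, CloseSource), assumes the codes 0 = "line fetched, more may follow", 1 = "line fetched, it was the last", 2 = "nothing fetched", and uses a global file variable fSrc invented for the sketch. The main block writes a small temporary file first, so the program can be run as it stands.

```pascal
program TextFileSourceSketch;
{$mode objfpc}{$H+}

uses
  SysUtils;

var
  fSrc: TextFile; // hypothetical: the open text file acting as the SourceRT

// Returns 0 if the file was opened for reading, 1 if it could not be opened.
function bOpenSrcOfLines(const sSourceL: string): byte;
begin
  AssignFile(fSrc, sSourceL);
  {$I-} Reset(fSrc); {$I+}
  if IOResult = 0 then
    Result := 0
  else
    Result := 1;
end;

// A text file CAN tell us we have just read the last line: Eof becomes true.
procedure FetchALine(var sLine: string; var iErr: integer);
begin
  sLine := '';
  if Eof(fSrc) then
    iErr := 2 // nothing left to fetch
  else
  begin
    ReadLn(fSrc, sLine);
    if Eof(fSrc) then
      iErr := 1 // fetched a line, and it was the last
    else
      iErr := 0; // fetched a line, more to come
  end;
end;

procedure CloseSource;
begin
  CloseFile(fSrc);
end;

var
  sName, s: string;
  iErr, iCount: integer;
  fOut: TextFile;
begin
  // Make a two-line file to read back.
  sName := GetTempFileName;
  AssignFile(fOut, sName);
  Rewrite(fOut);
  WriteLn(fOut, 'Alaska, 660,000');
  WriteLn(fOut, 'Texas, 270,000');
  CloseFile(fOut);

  iCount := 0;
  if bOpenSrcOfLines(sName) = 0 then
  begin
    repeat
      FetchALine(s, iErr);
      if iErr <> 2 then
      begin
        WriteLn(iErr, ': ', s);
        Inc(iCount);
      end;
    until iErr <> 0;
    CloseSource;
  end;
  WriteLn('Lines fetched: ', iCount);
  DeleteFile(sName);
end.
```

Swap in a different kind of source (a memo, a serial port, a database query) and only these three routines should need rewriting, provided the new FetchALine reports the same codes.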

In addition to that...

In addition to that, you will need to work on ProcessLine, almost every time you re-purpose the code. But if you are using a source that has the same characteristics as that used in a previous project, that's ALL you will have to tweak.

I've been programming for more years than most of my readers will have been alive. And it has taken me three days to get this far. It wouldn't take me three days to convert the fruits of this... when I finally get it finished!... to do a different sort of processing of LineRTs from a SourceRT. Having "stock" code, that you use as "boilerplates" for new projects is very worthwhile.

I'm sure that I am only re-inventing a wheel here. But if you like the way I present things, you're welcome to use it.

Anyway... I was tired of re-inventing this wheel over and over again because I have been a slow learner. From now on, I will take my own advice, use this boilerplate the next time I need to process LineRTs from a SourceRT. Turing knows I could have saved many hours over the years if I had created this a LONG time ago. Processing LineRTs from SourceRTs is a common task.


Back to the building work...

So! Our "add up some numbers" program is well along.

You do remember....

Alaska, 660,000
Texas, 270,000
California, 160,000
Montana, 140,000
New Mexico, 120,000

... don't you? (And, if you are working on the program, you should know that they should give 1,350,000, if your program is asked to add them up.)

I've put that in a little text file, StatesWithArea.txt, and already our code can load that into meSource, ready for going through, fetching each line in turn, and Processing it (extract the number part, add it to a growing total in the accumulator, iAccum).

For now, "Process" will just put a copy of each line in meOutput... so we can see that the loop works! We can leave the "fancy bits" of "Process" for AFTER we have the loop and FetchALine working. Think, as Mr. Watson pleaded, when you are choosing your path to the summit of Mt. Success.

Note that to get this far, not only have we modified the onClick handler for buDoIt, but also expanded bOpenSrcOfLines and FetchALine.

It may seem daunting that so much has to be done "just" to add up some areas... but remember: When this has been mastered, you will have 90% of the work for much more demanding tasks finished... forever!

Nearly "there"!

If you strip the "finished" code back by returning ProcessLine to...

procedure Tldn155linesfrmsrcF1.ProcessLine(sLineL:string);
begin
  meOutput.lines.add(sLineL);
  iAccum:=iAccum+1;
end;//of ProcessLine

... you will see my code ALMOST as it was at this stage. (The finished code also has some more code in the onClick handler for buDoIt that skips blank lines and comments. More on that later.)

The code "Worked"!!

Even if it still only COUNTED the LineRTs fetched, still didn't do anything "clever" with what was in the LineRTs.

But that's the point, really.

Applications do not have to be written from scratch every time.

On to the finish...

First we're going to introduce the idea of "comment LineRTs".

Our SourceRT, at the moment, must look something like...

Alaska, 660,000
Texas, 270,000
California, 160,000
Montana, 140,000
New Mexico, 120,000

It would be so much better if it could look like...

*Data at 2010, from reference book Pops Of the States

Alaska, 660,000
Texas, 270,000
California, 160,000
Montana, 140,000
New Mexico, 120,000

If we agree a rule...

"Any line that begins with an asterisk, or is blank, should be ignored"

... then we CAN have a SourceRT that looks like the above.

Even that simple proposal, alas, comes with complications. What do you mean by a blank line? You and I, as humans, have no problem with the idea of "a blank line"... but computers, as ever, are fussy.

Suppose the line above "Alaska..." in the example above isn't entirely blank. What if it has a space on it? The line would LOOK blank, to you and me... but the computer would "see" the space. Sigh.

And spaces aren't the only things that can be on a line and look like "nothing". The "TAB" character may or may not cause you problems, depending on the job you are doing.
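If you do want "looks blank" to count as blank, one possible approach is to trim the line before testing it. The sketch below is only an illustration: boLooksBlank is a hypothetical helper, not part of this tutorial's application. It relies on Trim from SysUtils, which strips leading and trailing spaces and control characters (TAB among them).

```pascal
program BlankLineSketch;
{$mode objfpc}{$H+}

uses
  SysUtils;

// Hypothetical helper: true if nothing is left on the line once
// leading and trailing spaces and control characters are removed.
function boLooksBlank(const sLine: string): boolean;
begin
  Result := Trim(sLine) = '';
end;

begin
  WriteLn(boLooksBlank(''));                // truly empty
  WriteLn(boLooksBlank('   '));             // spaces only
  WriteLn(boLooksBlank(#9' '#9));           // TAB, space, TAB
  WriteLn(boLooksBlank('Alaska, 660,000')); // real data
end.
```

The first three calls report TRUE and the last reports FALSE. Whether you want that behaviour depends on the job: in some data formats a TAB is a meaningful field separator, and a line of TABs is not "nothing".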

Early in ProcessLine, the programmer will introduce some checks on the LineRT that has been presented for processing. ProcessLine will have a subroutine to categorize a line. I've called mine bCatLine. It is a function that returns a byte, to indicate a category for the line passed to it. For my purposes here, I am going to write bCatLine to return "category" codes as follows.

1: Line had nothing on it
2: Line started with a "*"
0: All other cases

Note that bCatLine doesn't do anything about any of these circumstances. It merely reports them to the calling program. And, as I've written it, bCatLine would return 0 if the line consisted of one or more spaces (and/or TABs). I'll just have to be careful in the preparation of my source files.
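A minimal sketch of a function meeting that description follows. The name and the three category codes come from the text above; the body is a guess at a straightforward implementation, and the version in the download may well differ.

```pascal
program CatLineSketch;
{$mode objfpc}{$H+}

// Categorize a line: 1 = nothing on it, 2 = starts with "*", 0 = anything else.
// It only reports; deciding what to do about it is the caller's job.
function bCatLine(const sLineL: string): byte;
begin
  if sLineL = '' then
    Result := 1
  else if sLineL[1] = '*' then
    Result := 2
  else
    Result := 0;
end;

begin
  WriteLn(bCatLine(''));                // 1
  WriteLn(bCatLine('*Data at 2010'));   // 2
  WriteLn(bCatLine('Alaska, 660,000')); // 0
  WriteLn(bCatLine(' '));               // 0: a lone space is NOT "nothing"
end.
```

The caller can then skip a line when the result is 1 or 2, and pass it on for processing when the result is 0.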

(Speaking of which: DO provide for blank lines! With many text editors, it is easy to acquire a blank line at the end of a file.)

So far, so good, so general.

So! We have code to scan a SourceRT.

Just to finish off the example task proposed...

This-task specific "tweak"

We'll have to introduce some code that is specific to this task... but it "all" (mostly!) goes in the ProcessLine subroutine.

We have got to the point that we can "see" the "good" lines as they pass. One would be...

Alaska, 660,000

All we have to do is "pluck" the "660,000" out of that, add it to what we have in iAccum. Then Job Done.

The code was put in "iParseLine". It would, for the line given, return 660,000 as an integer, -999 if no number could be found in the line passed to it.
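To make the idea concrete, here is one possible way to write such a function. It is a sketch, not the iParseLine in the download. It assumes the number follows the first comma on the line and that any further commas are thousands separators, so it simply collects every digit after that first comma. A line such as "Alaska, 660,000 (approx. 2010)" would fool it.

```pascal
program ParseLineSketch;
{$mode objfpc}{$H+}

uses
  SysUtils;

// Returns the number found after the first comma, or -999 if there is none.
function iParseLine(const sLineL: string): integer;
var
  iComma, i: integer;
  sDigits: string;
begin
  Result := -999;
  iComma := Pos(',', sLineL);
  if iComma = 0 then
    Exit; // no comma, so no number where we expect one
  sDigits := '';
  for i := iComma + 1 to Length(sLineL) do
    if sLineL[i] in ['0'..'9'] then
      sDigits := sDigits + sLineL[i]; // spaces and separators are dropped
  if sDigits <> '' then
    Result := StrToIntDef(sDigits, -999);
end;

const
  Src: array[0..4] of string =
    ('Alaska, 660,000', 'Texas, 270,000', 'California, 160,000',
     'Montana, 140,000', 'New Mexico, 120,000');

var
  i, iAccum: integer;
begin
  iAccum := 0;
  for i := Low(Src) to High(Src) do
    iAccum := iAccum + iParseLine(Src[i]);
  WriteLn(iAccum); // 1350000
end.
```

In a real ProcessLine you would check for -999 before adding to iAccum, so that a malformed line gets reported rather than quietly subtracting 999 from the total.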

(My "parse line" is, by the way, I know, incredibly crude and ugly. Robust. But crude and ugly. It was late. I was tired. Sigh.)

That's it?

I hope that covers everything??

When I got to this point, it was late, I was tired... but I THINK "everything" has been accomplished... See what you think? The .exe, sample data files, and all of the source code are available from the following downloadable .zip....

ldn155linesfrmsrc.zip

A word about the name used for this tutorial's URL... [[to be done!!]]

A few words from the sponsors...

Please get in touch if you discover flaws in this page. Please mention the page's URL. (wywtk.com/lut/3linesfrmsrc/3linesfs-simp.htm).

If you found this of interest, please mention in forums, give it a Facebook "like", Google "Plus", or whatever. If you want more of this stuff, help!? There's not much point in me writing these things, if no one feels they are of any use.



How to email or write this page's editor, Tom Boyd. Please cite page's URL, 3linesfs-simp.htm, if you write.

