Jump to content

Hi guys, I am working on a csv file have rows approximately ...

VisuaLizeFOCUS .

Recommended Posts

Hi guys,

I am working on a csv file have rows approximately around 10 to 20 million of rows.

Report takes around 6-7 mins to load.

How can i increase its performance since i am using csv file so there is no connection dependency and also i am not doing any custom calculations in fex code.

Let me know your thoughts.


Link to comment
Share on other sites

Reading 10M rows from a CSV file takes a long time. The point Frans and I are making is that even though you CAN write reports using a CSV like this, it doesnt mean you SHOULD. Why Because when you read a sequential file like that, none of your WHERE tests, DEFINEs, COMPUTEs, BYs or ACROSSs will be applied until you read all 10M rows into memory. If youve no choice than to read the CSV, then the only other suggestion I can offer is to either run this report Deferred so it runs in the background, or schedule. However, if this data has to be used repeatedly, itll make more sense to incur the cost of loading it into a database (HYPRESTAGE, DATREC or FOCUS) once and then write your reports using that data, since itll be a damn sight faster than reading 10M rows out of a CSV.
Link to comment
Share on other sites

A few questions:

How many rows are on your output after you read the file

Are you aggregating your data

Do you have DEFINEs in there that could be moved to COMPUTEs

Do you have WHERE tests that are in there

Do you have to sort the report different ways or is the data already in the right order

Are you JOINing to anything else

Are you using the WEBVIEWER to see one page at a time

What Im thinking of is wondering if using TABLEF might help you (if you dont need sorting).

In general, youll get some flexibility out of putting your data in a database if thats possible and Id really recommend that.

5 or 6 minutes to read and aggregate 10m rows isnt all that bad for reading a flat file (at least as far as I know).

This reminds of the old days of FOCUS on the mainframe. If you are doing WHERE tests, you might be able to find an external utility to strip the rows out thats faster than FOCUS (thinking of Sync Sort). Strip the rows out, then report of the resulting file.

Link to comment
Share on other sites

  • 4 weeks later...

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
  • Create New...