NYCBabak (Posted January 12, 2021)
Batch load the CSV into a DBMS. If you have no DBMS write access, you can create a FOCUS file or a DATREC file instead. Then write your reports against the DBMS.
VisuaLizeFOCUS (Author, Posted January 12, 2021)
Hi guys, I am working with a CSV file of approximately 10 to 20 million rows. The report takes around 6-7 minutes to load. How can I improve its performance? Since I am using a CSV file there is no connection dependency, and I am not doing any custom calculations in the fex code. Let me know your thoughts. Thanks.
Warren Hinchliffe (Posted January 13, 2021)
How are you loading the table? MODIFY?
VisuaLizeFOCUS (Author, Posted January 13, 2021)
I am not loading the data into any database. I am using the CSV file directly to create reports. The number of rows in the CSV is around 10 million.
Franciscus van Dortmont (Posted January 13, 2021)
Load the data into a binary file or, better, into a database. Performance will be a lot better if you aggregate some data before loading, so instead of PRINT do a SUM with BY to lower the amount of data.
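A minimal sketch of the aggregate-then-hold idea Frans describes. All names here (SALESCSV, REGION, PRODUCT, AMOUNT, SALESAGG) are placeholders for your own synonym and fields:

```
-* One-time pass: aggregate the raw CSV and hold the result in a binary file.
-* SALESCSV, REGION, PRODUCT and AMOUNT are hypothetical names.
TABLE FILE SALESCSV
SUM AMOUNT
BY REGION
BY PRODUCT
ON TABLE HOLD AS SALESAGG FORMAT FOCUS
END
-* Subsequent reports read the much smaller aggregate instead of 10M raw rows.
TABLE FILE SALESAGG
PRINT AMOUNT
BY REGION
END
```

SUM with BY collapses detail rows to one row per sort key, so the HOLD file can be orders of magnitude smaller than the source CSV.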
NYCBabak (Posted January 13, 2021)
Reading 10M rows from a CSV file takes a long time. The point Frans and I are making is that even though you CAN write reports against a CSV like this, it doesn't mean you SHOULD. Why? Because when you read a sequential file like that, none of your WHERE tests, DEFINEs, COMPUTEs, BYs or ACROSSes can be applied until all 10M rows have been read into memory. If you have no choice but to read the CSV, then the only other suggestion I can offer is to run this report Deferred so it runs in the background, or schedule it. However, if this data has to be used repeatedly, it makes more sense to incur the cost of loading it into a database (HYPERSTAGE, DATREC or FOCUS) once and then write your reports against that data, since it will be a damn sight faster than reading 10M rows out of a CSV.
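A sketch of the one-time load NYCBabak suggests, using a FOCUS-format HOLD file with an index on the field the reports filter on (BIGCSV, CUSTID, AMOUNT and BIGFOC are hypothetical names, and the indexed-HOLD approach is one option among the formats he lists):

```
-* One-time load: copy the CSV into a FOCUS file, indexing the filter field.
-* BIGCSV, CUSTID, AMOUNT and BIGFOC are placeholder names.
TABLE FILE BIGCSV
PRINT AMOUNT
BY CUSTID
ON TABLE HOLD AS BIGFOC FORMAT FOCUS INDEX CUSTID
END
-* Later reports can retrieve via the index instead of scanning all 10M rows.
TABLE FILE BIGFOC
SUM AMOUNT
WHERE CUSTID EQ 'C1001'
END
```

The load cost is paid once; every report after that benefits from indexed retrieval.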
Toby Mills (Posted January 13, 2021)
A few questions:
How many rows are in your output after you read the file?
Are you aggregating your data?
Do you have DEFINEs in there that could be moved to COMPUTEs?
Do you have WHERE tests in there?
Do you have to sort the report different ways, or is the data already in the right order?
Are you JOINing to anything else?
Are you using the WEBVIEWER to see one page at a time?
What I'm thinking is that using TABLEF might help you (if you don't need sorting). In general, you'll get some flexibility out of putting your data in a database if that's possible, and I'd really recommend that. 5 or 6 minutes to read and aggregate 10M rows isn't all that bad for reading a flat file (at least as far as I know). This reminds me of the old days of FOCUS on the mainframe. If you are doing WHERE tests, you might be able to find an external utility that strips the rows out faster than FOCUS can (thinking of SyncSort). Strip the rows out, then report off the resulting file.
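A sketch of the TABLEF idea, with hypothetical names: TABLEF reads the file sequentially and skips TABLE's sort phase, so it can be faster on a large flat file, provided the data is already in the order you need.

```
-* TABLEF skips the sort phase of TABLE; use it only when the data is
-* already in the right physical order. BIGCSV and its fields are
-* placeholder names.
TABLEF FILE BIGCSV
PRINT ORDERNO AMOUNT
WHERE AMOUNT GT 1000
END
```

WHERE tests still apply, but features that depend on sorting are restricted, which is why this only helps when you do not need to re-sort.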
VisuaLizeFOCUS (Author, Posted February 6, 2021)
Loaded all the data into SQL Server using a Flow in WebFOCUS. Now the performance is much better compared to the CSV, FOCUS, XFOCUS and DATREC sources. Thanks.