Erin Trotter Posted March 9, 2023 Share Posted March 9, 2023 I have a project where I have to hold quite a bit of data in hold files for later reference. Some of the files are rather large. I'm trying to find out what hold format is best for this data to reduce file size on disk Link to comment Share on other sites More sharing options...
Martin Yergeau Posted March 9, 2023 Share Posted March 9, 2023 No format is really better then another one regarding file size on disk.When you HOLD on disk, at the end it's like you are holding a text file.The difference about format it's their size & features limitation.Per example HOLD FORMAT FOCUS or HOLD FORMAT XFOCUS can be INDEXedXFOCUS can manage more data than FOCUS where FOCUS can hold up to 2M rows (if I remember well) and XFOCUS doesn't have that limitation.If disk space is an issue for you, depending of what is available from your environment, you can HOLD as a SQL tableE.g. FORMAT SQLMSS <-- For MS-SQL DB Link to comment Share on other sites More sharing options...
David Beagan Posted March 9, 2023 Share Posted March 9, 2023 How big is "rather large"? Are you concerned about running out of space? Link to comment Share on other sites More sharing options...
Erin Trotter Posted March 9, 2023 Author Share Posted March 9, 2023 In this instance, I've had one file that consumed over 50 GB of space... I'm trying to figure out ways to break this file up into smaller pieces. I've been holding them as binary, but was not sure if that was the best way.And yes, I am concerned about running out of space. It is already starting to happen. So, before I request more space, I just wanted to see what all of my options are for preserving as much as I can Link to comment Share on other sites More sharing options...
David Beagan Posted March 9, 2023 Share Posted March 9, 2023 Ok Erin, yes, I see where you are coming from.Have you considered Hyperstage? Seems like we saw significant space savings when we tried it last year and it was faster. Link to comment Share on other sites More sharing options...
Erin Trotter Posted March 9, 2023 Author Share Posted March 9, 2023 No, we have not at this point. This reporting structure is a new one that we are just starting out in because we are switching from SQL server to Snowflake and trying to keep our querying costs low by not hitting snowflake every time one of the many users run a report. We figured if we run a report that loads all of the data once a day, it will help us save in that way but then we are running into this on the other side Link to comment Share on other sites More sharing options...
Martin Yergeau Posted March 9, 2023 Share Posted March 9, 2023 "We figured if we run a report that loads all of the data once a day, it will help us save in that way"But my question still remain : can you hold as SQL table instead of on disk ?I am doing that a lot of time : holding once a day data into a unique SQL table to avoid querring the snowflake all the time at each request Link to comment Share on other sites More sharing options...
Erin Trotter Posted March 9, 2023 Author Share Posted March 9, 2023 So the deal with that is my boss would like to do away with SQL, so we are trying to move completely away from it Link to comment Share on other sites More sharing options...
David Beagan Posted March 9, 2023 Share Posted March 9, 2023 Do away with SQL databases? Or do you just mean coding in SQL? Link to comment Share on other sites More sharing options...
Martin Yergeau Posted March 9, 2023 Share Posted March 9, 2023 Got it.But you may encounter a lot of limitation with a 50Gb data file.With that size you'll be better to use something such as a DB or as suggested by David, Hyperstage... Link to comment Share on other sites More sharing options...
Erin Trotter Posted March 9, 2023 Author Share Posted March 9, 2023 No. I mean SQL databases as in migrate everything that is on SQL into Snowflake and shut the servers down. Link to comment Share on other sites More sharing options...
Martin Yergeau Posted March 9, 2023 Share Posted March 9, 2023 "migrate everything that is on SQL into Snowflake"Can you not hold your data into Snowflake (cloud data hosting) ? Link to comment Share on other sites More sharing options...
Erin Trotter Posted March 9, 2023 Author Share Posted March 9, 2023 I'm not sure about that one. I will have to look into what that means exactly Link to comment Share on other sites More sharing options...
David Beagan Posted March 10, 2023 Share Posted March 10, 2023 If the version of the reporting server you have installed includes Hyperstage you might want to try this out on your data. The ggsales.foc file is about 820 KB. I created a Hyperstage database of this ggsales data. ENGINE INT SET TABLESAMPLING_PERCENTAGEENGINE INT SET FCA ONENGINE SQLHYPG SET BULKLOAD ONENGINE SQLHYPG SET BULKCHECK 1000 APP HOLD baseapp TABLE FILE ggsales PRINT * ON TABLE HOLD AS gg_hypg FORMAT SQLHYPG ENDThe APP HOLD directs the Hyperstage table's metadata to be stored in the application folder. Then you can TABLE FILE gg_hypgto report off of this data. The data is stored under the ibiHyperstagePGib_datawebfocuspublicgg_hypg.bth folder. When I right click this folder and select Properties the Size on disk is 68 KB. That is about a twelve-fold reduction in size. This reduction would probably vary depending on the inherent reduntancy in your data.Note, without the ENGINE SQLHYPG SET BULKLOAD ONI found that the creation of the file was exteemly slow. Link to comment Share on other sites More sharing options...
Erin Trotter Posted March 10, 2023 Author Share Posted March 10, 2023 Thanks for the suggestion. I will look into this! Link to comment Share on other sites More sharing options...
Recommended Posts
Create an account or sign in to comment
You need to be a member in order to leave a comment
Create an account
Sign up for a new account in our community. It's easy!
Register a new accountSign in
Already have an account? Sign in here.
Sign In Now