Ian Dalton Posted January 6, 2022 Posted January 6, 2022 We have some customer data that need to be anonymized after it is copied from our PROD server to DEV/UAT. The data is held in Focus databases - does anyone have any handy routines that might do this ie. for customer names and bank account numbers
Martin Yergeau Posted January 6, 2022 Posted January 6, 2022 May not be exactly what you are looking for, but see chapter 10 TIB_wfwf_8207.28.0_dd_language.pdf (tibco.com)
David Briars Posted January 6, 2022 Posted January 6, 2022 Perhaps in your ETL process you can shuffle characters from your PROD extract before you load to DEV DEFINE FILE GGSALES PRODUCT_SHUFFLED/A16 = PRODUCT; SRCHFOR/A64 = '-''0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'; REPLWITH/A64 = '-''1$3*5@7#9%MN@PQRSTUVWXYZABCDEFGHIJKLmnop*%stuvwxyzabcdefghijkl'; -* Iterate through PRODUCT and shuffle characters. -REPEAT LOOP_DESC FOR &I FROM 1 TO 16 CURRCHAR/A1 = SUBSTR(16, PRODUCT_SHUFFLED, &I, &I, 1, 'A1'); SRCHFORPOS/I3 = POSIT(SRCHFOR, 64, CURRCHAR, 1, 'I3'); REPLWITHC/A1 = SUBSTR(64, REPLWITH, SRCHFORPOS, SRCHFORPOS, 1, 'A1'); PRODUCT_SHUFFLED/A35 = OVRLAY(PRODUCT_SHUFFLED, 35, REPLWITHC, 1, &I, 'A35'); -LOOP_DESC -* END TABLE FILE GGSALES SUM PRODUCT_SHUFFLED AS 'Product' BY PRODUCT_SHUFFLED NOPRINT ON TABLE SET PAGE OFF END
Ian Dalton Posted January 7, 2022 Author Posted January 7, 2022 Thanks David but whilst your ETL routine looks like something that could be of use it would take some time to convert the data. What I am thinking is that we just have various Master File Define statements like what you have suggested to scramble the data as it is used. This can be done on a field by field basis.
David Briars Posted January 7, 2022 Posted January 7, 2022 If it isnt required to store the shuffled data in the DEV/TEST databases, you absolutely can shuffle the data at report time. A centralized location for the DEFINEs is the way to go. Have fun shuffling the deck; sounds like it is time to deal.
Ian Dalton Posted January 7, 2022 Author Posted January 7, 2022 Yes indeed !! Thanks again for your advice.
Pierre CASTILLO 2 Posted January 12, 2022 Posted January 12, 2022 Sorry to add my 2 cents while its already solved. I thought it would be a good place to use DEFINE FUNCTION. Lets try this : DEFINE FUNCTION SCRAMBLE DESCRIPTION Scramble letters in name (INPUT_NAME/A50) OUTPUT_NAME/A50=INPUT_NAME; -* OVRLAY(source_string, length, substring, sublen, position, output); -* SUBSTRING(string, position, length) -* PRDUNI(seed, output) LIST_SCRAMBLE /A62 = ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789abcdefghijklmnopqrstuvwxyz; -REPEAT :END_LOOP FOR &POSITION FROM 1 TO 50 STEP 1 INPUT_SCRAMBLE_&POSITION.EVAL /A01 = SUBSTRING(INPUT_NAME, &POSITION, 1); SHIFT_SCRAMBLE_&POSITION.EVAL /I03 = ((62 * PRDUNI(&POSITION, D12.10)) + 0 ) ; OUTPUT_SCRAMBLE_&POSITION.EVAL /A01 = IF INPUT_SCRAMBLE_&POSITION.EVAL EQ THEN ELSE SUBSTRING(LIST_SCRAMBLE, SHIFT_SCRAMBLE_&POSITION.EVAL, 1); OUTPUT_NAME /A50 = OVRLAY(OUTPUT_NAME, 50, OUTPUT_SCRAMBLE_&POSITION.EVAL, 1, &POSITION, A50); -:END_LOOP SCRAMBLE /A50 = OUTPUT_NAME; END TABLE FILE xyz PRINT IN_NAME COMPUTE OUT_NAME/A50 = SCRAMBLE(IN_NAME); END
Charles Morris 2 Posted January 12, 2022 Posted January 12, 2022 If you dont scramble the data as it is loaded to Dev, the raw unscrambled data will still be available to anyone who has edit access to the MFD - sample data or data profiling. Even if the fields are hidden so they dont display in reporting tools and only the DEFINES are exposed, a reasonably clever developer could get to the raw data
Ian Dalton Posted March 31, 2022 Author Posted March 31, 2022 Hi David, just looking into this a little more. Do you think your technique with the DEFINE statements with loop could be added to the the Master File Description
David Beagan Posted April 1, 2022 Posted April 1, 2022 How secure do the names need to be from a resourceful developer With the -REPEAT loop approach, you could possibly use a Master File Profile to get it to work within an MFD. Heres an approach that should be safe from resourceful developers, doesnt use -REPEAT and can be done in the MFD. First create the following fex in baseapp or other app path folder It defines a function, ANON, that changes e, t and a (most common letters) to a random letter. It also changes random letters to other random letters ten times. You could tweak this as you see fit. myfunc.fex DEFINE FUNCTION ANON(STRING/A100) STRING/A100 = LOWER(STRING); STRING/A100 = REPLACE(STRING, 'e', CHAR(RDUNIF('P9')*26+97)); STRING/A100 = REPLACE(STRING, 't', CHAR(RDUNIF('P9')*26+97)); STRING/A100 = REPLACE(STRING, 'a', CHAR(RDUNIF('P9')*26+97)); STRING/A100 = REPLACE(STRING, CHAR(RDUNIF('P9')*26+97), CHAR(RDUNIF('P9')*26+97)); STRING/A100 = REPLACE(STRING, CHAR(RDUNIF('P9')*26+97), CHAR(RDUNIF('P9')*26+97)); STRING/A100 = REPLACE(STRING, CHAR(RDUNIF('P9')*26+97), CHAR(RDUNIF('P9')*26+97)); STRING/A100 = REPLACE(STRING, CHAR(RDUNIF('P9')*26+97), CHAR(RDUNIF('P9')*26+97)); STRING/A100 = REPLACE(STRING, CHAR(RDUNIF('P9')*26+97), CHAR(RDUNIF('P9')*26+97)); STRING/A100 = REPLACE(STRING, CHAR(RDUNIF('P9')*26+97), CHAR(RDUNIF('P9')*26+97)); STRING/A100 = REPLACE(STRING, CHAR(RDUNIF('P9')*26+97), CHAR(RDUNIF('P9')*26+97)); STRING/A100 = REPLACE(STRING, CHAR(RDUNIF('P9')*26+97), CHAR(RDUNIF('P9')*26+97)); STRING/A100 = REPLACE(STRING, CHAR(RDUNIF('P9')*26+97), CHAR(RDUNIF('P9')*26+97)); STRING/A100 = REPLACE(STRING, CHAR(RDUNIF('P9')*26+97), CHAR(RDUNIF('P9')*26+97)); ANON/A100V = LCWORD3(100, STRING, 'A100'); END For the MFD I tried it out with empdata, adding two defines at the bottom: FILENAME=EMPDATA, SUFFIX=FOC, REMARKS='Legacy Metadata Sample: empdata',$ SEGNAME=EMPDATA, SEGTYPE=S1 FIELDNAME=PIN, ALIAS=ID, FORMAT=A9, INDEX=I, $ FIELDNAME=LASTNAME, ALIAS=LN, FORMAT=A15, $ FIELDNAME=FIRSTNAME, ALIAS=FN, FORMAT=A10, $ FIELDNAME=MIDINITIAL, ALIAS=MI, FORMAT=A1, $ FIELDNAME=DIV, ALIAS=CDIV, FORMAT=A4, $ FIELDNAME=DEPT, ALIAS=CDEPT, FORMAT=A20, $ FIELDNAME=JOBCLASS, ALIAS=CJCLAS, FORMAT=A8, $ FIELDNAME=TITLE, ALIAS=CFUNC, FORMAT=A20, $ FIELDNAME=SALARY, ALIAS=CSAL, FORMAT=D12.2M, $ FIELDNAME=HIREDATE, ALIAS=HDAT, FORMAT=YMD, $ $ DEFINE AREA/A13=DECODE DIV (NE 'NORTH EASTERN' SE 'SOUTH EASTERN' CE 'CENTRAL' WE 'WESTERN' CORP 'CORPORATE' ELSE 'INVALID AREA');$ DEFINE FNAME/A20 = DF.myfunc.ANON(FIRSTNAME); $ DEFINE LNAME/A20 = DF.myfunc.ANON(LASTNAME); $ Run a test: TABLE FILE empdata PRINT FIRSTNAME LASTNAME FNAME LNAME END image.png634608 53.5 KB One possible drawback, a value that occurs more than once, e.g. WHITE, doesnt get the same scrambled value. This might be an issue for something line a bank account number. I have an idea to handle that too.
Ian Dalton Posted April 4, 2022 Author Posted April 4, 2022 Hi David, your suggestion looks good and I am playing around with it now. I will get back to you. Just one question though is the technique you describe going to produce different results each time it is run as uses the RDUNIF function I assume it will.
Ian Dalton Posted April 4, 2022 Author Posted April 4, 2022 David, tried your suggestion above and I cant get this to work as REPLACE function isnt recognized. The concept of putting a call to a defined function seems to work ok generally as long as you dont have Dialog Manager commands in the function with REPEAT loops. When I did this it said it couldnt find my fex. I have attached what I tried and couldnt get to work ****master *** SEGMENT=SET_ARNG,SEGTYPE=S1,PARENT=CURRENCY, FIELD=EBNK_ACTY_CD ,ALIAS=EBAT_CDE ,FORMAT=A2 ,$ FIELD=EBANK_CODE ,ALIAS=EBNK_CDE ,FORMAT=A4 ,$ FIELD=ESERV_CD_DB ,ALIAS=ESERV_DB ,FORMAT=A2 ,$ FIELD=ESERV_CD_CR ,ALIAS=ESERV_CR ,FORMAT=A2 ,$ FIELD=EBRANCH_CODE ,ALIAS=EBR_CODE ,FORMAT=A9 ,$ FIELD=EBAC_NUMBER ,ALIAS=EBAC_NO ,FORMAT=A34 ,$ FIELD=EBAC_NAME ,ALIAS=EBAC_NAM ,FORMAT=A50 ,$ FIELD=EADDR_USE_CD ,ALIAS=EADUS_CD ,FORMAT=A2 ,$ FIELD=EVAL_MSTFILE ,ALIAS=VL_MAST ,FORMAT=A1 ,$ FIELD=ESET_ARNGKEY ,ALIAS=SA_KEY ,FORMAT=A17 ,FIELDTYPE=I, $ FIELD=EBANK_CURT ,ALIAS=BANK_CT ,FORMAT=A5 ,FIELDTYPE=I, $ FIELD=EBANK_BR_CT ,ALIAS=BKBR_CT ,FORMAT=A14 ,FIELDTYPE=I, $ FIELD=ECURRENT ,ALIAS=ECURT ,FORMAT=A1 ,$ FIELD=ECHG_DATE ,ALIAS=ECHG ,FORMAT=DMYY ,$ FIELD=ESWIFT_CODE ,ALIAS=SWIFT ,FORMAT=A11 ,$ FIELD=EINTER_BANK ,ALIAS=INTER ,FORMAT=A50 ,$ FIELD=EINTER_SWIFT ,ALIAS=INT_SWFT ,FORMAT=A11 ,$ FIELD=EINTER_CODE ,ALIAS=INT_CODE ,FORMAT=A20 ,$ FIELD=EBEN_ACTY_CD ,ALIAS=BEN_ACTY ,FORMAT=A2 ,$ DEFINE EBAC_NMEANON/A50 = DF.ANONNAM.SCRAMBLE(EBAC_NAME); DESCRIPTION = Calls SCRAMBLE to Anonymise data,$ *** function in focexec ANONNAM.fex*** DEFINE FUNCTION SCRAMBLE DESCRIPTION Scramble letters in name (STRING/A50) OUTPUT_NAME/A50=STRING; -* OVRLAY(source_string, length, substring, sublen, position, output); -* SUBSTRING(string, position, length) -* PRDUNI(seed, output) LIST_SCRAMBLE /A62 = ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789abcdefghijklmnopqrstuvwxyz; -REPEAT :END_LOOP FOR &POSITION FROM 1 TO 1 STEP 1 INPUT_SCRAMBLE_&POSITION.EVAL /A01 = SUBSTRING(STRING, &POSITION, 1); SHIFT_SCRAMBLE_&POSITION.EVAL /I03 = ((62 * PRDUNI(&POSITION, D12.10)) + 0 ) ; OUTPUT_SCRAMBLE_&POSITION.EVAL /A01 = IF INPUT_SCRAMBLE_&POSITION.EVAL EQ THEN ELSE SUBSTRING(LIST_SCRAMBLE, SHIFT_SCRAMBLE_&POSITION.EVAL, 1); OUTPUT_NAME /A50 = OVRLAY(OUTPUT_NAME, 50, OUTPUT_SCRAMBLE_&POSITION.EVAL, 1, &POSITION, A50); -:END_LOOP SCRAMBLE /A50 = OUTPUT_NAME; END *** calling code *** DEFINE NSACHOLD -RUN WHENCE ANONNAM FOCEXEC -RUN TABLE FILE NSACHOLD PRINT EBAC_NAME EBAC_NMEANON -*COMPUTE OUT_NAME/A50 = SCRAMBLE(EBAC_NAME); IF RECORDLIMIT EQ 200 END
David Beagan Posted April 4, 2022 Posted April 4, 2022 I wasnt expecting the DEFINE FUNCTION to work with Dialogue Manager. I was able to get it to work this way, eliminating the -REPEAT. Also, it works a bit better if the final assignment of SCRAMBLE is varchar. DEFINE FUNCTION SCRAMBLE DESCRIPTION 'Scramble letters in name' (STRING/A50) OUTPUT_NAME/A50 = STRING; LIST_SCRAMBLE /A62 = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789abcdefghijklmnopqrstuvwxyz'; INPUT_SCRAMBLE_1 /A01 = SUBSTRING(STRING, 1, 1); SHIFT_SCRAMBLE_1 /I03 = ((62 * PRDUNI(1, 'D12.10')) + 0 ) ; OUTPUT_SCRAMBLE_1 /A01 = IF INPUT_SCRAMBLE_1 EQ ' ' THEN ' ' ELSE SUBSTRING(LIST_SCRAMBLE, SHIFT_SCRAMBLE_1, 1); OUTPUT_NAME /A50 = OVRLAY(OUTPUT_NAME, 50, OUTPUT_SCRAMBLE_1, 1, 1, 'A50'); INPUT_SCRAMBLE_2 /A01 = SUBSTRING(STRING, 2, 1); SHIFT_SCRAMBLE_2 /I03 = ((62 * PRDUNI(2, 'D12.10')) + 0 ) ; OUTPUT_SCRAMBLE_2 /A01 = IF INPUT_SCRAMBLE_2 EQ ' ' THEN ' ' ELSE SUBSTRING(LIST_SCRAMBLE, SHIFT_SCRAMBLE_2, 1); OUTPUT_NAME /A50 = OVRLAY(OUTPUT_NAME, 50, OUTPUT_SCRAMBLE_2, 1, 2, 'A50'); INPUT_SCRAMBLE_3 /A01 = SUBSTRING(STRING, 3, 1); SHIFT_SCRAMBLE_3 /I03 = ((62 * PRDUNI(3, 'D12.10')) + 0 ) ; OUTPUT_SCRAMBLE_3 /A01 = IF INPUT_SCRAMBLE_3 EQ ' ' THEN ' ' ELSE SUBSTRING(LIST_SCRAMBLE, SHIFT_SCRAMBLE_3, 1); OUTPUT_NAME /A50 = OVRLAY(OUTPUT_NAME, 50, OUTPUT_SCRAMBLE_3, 1, 3, 'A50'); INPUT_SCRAMBLE_4 /A01 = SUBSTRING(STRING, 4, 1); SHIFT_SCRAMBLE_4 /I03 = ((62 * PRDUNI(4, 'D12.10')) + 0 ) ; OUTPUT_SCRAMBLE_4 /A01 = IF INPUT_SCRAMBLE_4 EQ ' ' THEN ' ' ELSE SUBSTRING(LIST_SCRAMBLE, SHIFT_SCRAMBLE_4, 1); OUTPUT_NAME /A50 = OVRLAY(OUTPUT_NAME, 50, OUTPUT_SCRAMBLE_4, 1, 4, 'A50'); INPUT_SCRAMBLE_5 /A01 = SUBSTRING(STRING, 5, 1); SHIFT_SCRAMBLE_5 /I03 = ((62 * PRDUNI(5, 'D12.10')) + 0 ) ; OUTPUT_SCRAMBLE_5 /A01 = IF INPUT_SCRAMBLE_5 EQ ' ' THEN ' ' ELSE SUBSTRING(LIST_SCRAMBLE, SHIFT_SCRAMBLE_5, 1); OUTPUT_NAME /A50 = OVRLAY(OUTPUT_NAME, 50, OUTPUT_SCRAMBLE_5, 1, 5, 'A50'); INPUT_SCRAMBLE_6 /A01 = SUBSTRING(STRING, 6, 1); SHIFT_SCRAMBLE_6 /I03 = ((62 * PRDUNI(6, 'D12.10')) + 0 ) ; OUTPUT_SCRAMBLE_6 /A01 = IF INPUT_SCRAMBLE_6 EQ ' ' THEN ' ' ELSE SUBSTRING(LIST_SCRAMBLE, SHIFT_SCRAMBLE_6, 1); OUTPUT_NAME /A50 = OVRLAY(OUTPUT_NAME, 50, OUTPUT_SCRAMBLE_6, 1, 6, 'A50'); INPUT_SCRAMBLE_7 /A01 = SUBSTRING(STRING, 7, 1); SHIFT_SCRAMBLE_7 /I03 = ((62 * PRDUNI(7, 'D12.10')) + 0 ) ; OUTPUT_SCRAMBLE_7 /A01 = IF INPUT_SCRAMBLE_7 EQ ' ' THEN ' ' ELSE SUBSTRING(LIST_SCRAMBLE, SHIFT_SCRAMBLE_7, 1); OUTPUT_NAME /A50 = OVRLAY(OUTPUT_NAME, 50, OUTPUT_SCRAMBLE_7, 1, 7, 'A50'); INPUT_SCRAMBLE_8 /A01 = SUBSTRING(STRING, 8, 1); SHIFT_SCRAMBLE_8 /I03 = ((62 * PRDUNI(8, 'D12.10')) + 0 ) ; OUTPUT_SCRAMBLE_8 /A01 = IF INPUT_SCRAMBLE_8 EQ ' ' THEN ' ' ELSE SUBSTRING(LIST_SCRAMBLE, SHIFT_SCRAMBLE_8, 1); OUTPUT_NAME /A50 = OVRLAY(OUTPUT_NAME, 50, OUTPUT_SCRAMBLE_8, 1, 8, 'A50'); INPUT_SCRAMBLE_9 /A01 = SUBSTRING(STRING, 9, 1); SHIFT_SCRAMBLE_9 /I03 = ((62 * PRDUNI(9, 'D12.10')) + 0 ) ; OUTPUT_SCRAMBLE_9 /A01 = IF INPUT_SCRAMBLE_9 EQ ' ' THEN ' ' ELSE SUBSTRING(LIST_SCRAMBLE, SHIFT_SCRAMBLE_9, 1); OUTPUT_NAME /A50 = OVRLAY(OUTPUT_NAME, 50, OUTPUT_SCRAMBLE_9, 1, 9, 'A50'); INPUT_SCRAMBLE_10 /A01 = SUBSTRING(STRING, 10, 1); SHIFT_SCRAMBLE_10 /I03 = ((62 * PRDUNI(10, 'D12.10')) + 0 ) ; OUTPUT_SCRAMBLE_10 /A01 = IF INPUT_SCRAMBLE_10 EQ ' ' THEN ' ' ELSE SUBSTRING(LIST_SCRAMBLE, SHIFT_SCRAMBLE_10, 1); OUTPUT_NAME /A50 = OVRLAY(OUTPUT_NAME, 50, OUTPUT_SCRAMBLE_10, 1, 10, 'A50'); INPUT_SCRAMBLE_11 /A01 = SUBSTRING(STRING, 11, 1); SHIFT_SCRAMBLE_11 /I03 = ((62 * PRDUNI(11, 'D12.10')) + 0 ) ; OUTPUT_SCRAMBLE_11 /A01 = IF INPUT_SCRAMBLE_11 EQ ' ' THEN ' ' ELSE SUBSTRING(LIST_SCRAMBLE, SHIFT_SCRAMBLE_11, 1); OUTPUT_NAME /A50 = OVRLAY(OUTPUT_NAME, 50, OUTPUT_SCRAMBLE_11, 1, 11, 'A50'); INPUT_SCRAMBLE_12 /A01 = SUBSTRING(STRING, 12, 1); SHIFT_SCRAMBLE_12 /I03 = ((62 * PRDUNI(12, 'D12.10')) + 0 ) ; OUTPUT_SCRAMBLE_12 /A01 = IF INPUT_SCRAMBLE_12 EQ ' ' THEN ' ' ELSE SUBSTRING(LIST_SCRAMBLE, SHIFT_SCRAMBLE_12, 1); OUTPUT_NAME /A50 = OVRLAY(OUTPUT_NAME, 50, OUTPUT_SCRAMBLE_12, 1, 12, 'A50'); INPUT_SCRAMBLE_13 /A01 = SUBSTRING(STRING, 13, 1); SHIFT_SCRAMBLE_13 /I03 = ((62 * PRDUNI(13, 'D12.10')) + 0 ) ; OUTPUT_SCRAMBLE_13 /A01 = IF INPUT_SCRAMBLE_13 EQ ' ' THEN ' ' ELSE SUBSTRING(LIST_SCRAMBLE, SHIFT_SCRAMBLE_13, 1); OUTPUT_NAME /A50 = OVRLAY(OUTPUT_NAME, 50, OUTPUT_SCRAMBLE_13, 1, 13, 'A50'); INPUT_SCRAMBLE_14 /A01 = SUBSTRING(STRING, 14, 1); SHIFT_SCRAMBLE_14 /I03 = ((62 * PRDUNI(14, 'D12.10')) + 0 ) ; OUTPUT_SCRAMBLE_14 /A01 = IF INPUT_SCRAMBLE_14 EQ ' ' THEN ' ' ELSE SUBSTRING(LIST_SCRAMBLE, SHIFT_SCRAMBLE_14, 1); OUTPUT_NAME /A50 = OVRLAY(OUTPUT_NAME, 50, OUTPUT_SCRAMBLE_14, 1, 14, 'A50'); INPUT_SCRAMBLE_15 /A01 = SUBSTRING(STRING, 15, 1); SHIFT_SCRAMBLE_15 /I03 = ((62 * PRDUNI(15, 'D12.10')) + 0 ) ; OUTPUT_SCRAMBLE_15 /A01 = IF INPUT_SCRAMBLE_15 EQ ' ' THEN ' ' ELSE SUBSTRING(LIST_SCRAMBLE, SHIFT_SCRAMBLE_15, 1); OUTPUT_NAME /A50 = OVRLAY(OUTPUT_NAME, 50, OUTPUT_SCRAMBLE_15, 1, 15, 'A50'); INPUT_SCRAMBLE_16 /A01 = SUBSTRING(STRING, 16, 1); SHIFT_SCRAMBLE_16 /I03 = ((62 * PRDUNI(16, 'D12.10')) + 0 ) ; OUTPUT_SCRAMBLE_16 /A01 = IF INPUT_SCRAMBLE_16 EQ ' ' THEN ' ' ELSE SUBSTRING(LIST_SCRAMBLE, SHIFT_SCRAMBLE_16, 1); OUTPUT_NAME /A50 = OVRLAY(OUTPUT_NAME, 50, OUTPUT_SCRAMBLE_16, 1, 16, 'A50'); INPUT_SCRAMBLE_17 /A01 = SUBSTRING(STRING, 17, 1); SHIFT_SCRAMBLE_17 /I03 = ((62 * PRDUNI(17, 'D12.10')) + 0 ) ; OUTPUT_SCRAMBLE_17 /A01 = IF INPUT_SCRAMBLE_17 EQ ' ' THEN ' ' ELSE SUBSTRING(LIST_SCRAMBLE, SHIFT_SCRAMBLE_17, 1); OUTPUT_NAME /A50 = OVRLAY(OUTPUT_NAME, 50, OUTPUT_SCRAMBLE_17, 1, 17, 'A50'); INPUT_SCRAMBLE_18 /A01 = SUBSTRING(STRING, 18, 1); SHIFT_SCRAMBLE_18 /I03 = ((62 * PRDUNI(18, 'D12.10')) + 0 ) ; OUTPUT_SCRAMBLE_18 /A01 = IF INPUT_SCRAMBLE_18 EQ ' ' THEN ' ' ELSE SUBSTRING(LIST_SCRAMBLE, SHIFT_SCRAMBLE_18, 1); OUTPUT_NAME /A50 = OVRLAY(OUTPUT_NAME, 50, OUTPUT_SCRAMBLE_18, 1, 18, 'A50'); INPUT_SCRAMBLE_19 /A01 = SUBSTRING(STRING, 19, 1); SHIFT_SCRAMBLE_19 /I03 = ((62 * PRDUNI(19, 'D12.10')) + 0 ) ; OUTPUT_SCRAMBLE_19 /A01 = IF INPUT_SCRAMBLE_19 EQ ' ' THEN ' ' ELSE SUBSTRING(LIST_SCRAMBLE, SHIFT_SCRAMBLE_19, 1); OUTPUT_NAME /A50 = OVRLAY(OUTPUT_NAME, 50, OUTPUT_SCRAMBLE_19, 1, 19, 'A50'); INPUT_SCRAMBLE_20 /A01 = SUBSTRING(STRING, 20, 1); SHIFT_SCRAMBLE_20 /I03 = ((62 * PRDUNI(20, 'D12.10')) + 0 ) ; OUTPUT_SCRAMBLE_20 /A01 = IF INPUT_SCRAMBLE_20 EQ ' ' THEN ' ' ELSE SUBSTRING(LIST_SCRAMBLE, SHIFT_SCRAMBLE_20, 1); OUTPUT_NAME /A50 = OVRLAY(OUTPUT_NAME, 50, OUTPUT_SCRAMBLE_20, 1, 20, 'A50'); INPUT_SCRAMBLE_21 /A01 = SUBSTRING(STRING, 21, 1); SHIFT_SCRAMBLE_21 /I03 = ((62 * PRDUNI(21, 'D12.10')) + 0 ) ; OUTPUT_SCRAMBLE_21 /A01 = IF INPUT_SCRAMBLE_21 EQ ' ' THEN ' ' ELSE SUBSTRING(LIST_SCRAMBLE, SHIFT_SCRAMBLE_21, 1); OUTPUT_NAME /A50 = OVRLAY(OUTPUT_NAME, 50, OUTPUT_SCRAMBLE_21, 1, 21, 'A50'); INPUT_SCRAMBLE_22 /A01 = SUBSTRING(STRING, 22, 1); SHIFT_SCRAMBLE_22 /I03 = ((62 * PRDUNI(22, 'D12.10')) + 0 ) ; OUTPUT_SCRAMBLE_22 /A01 = IF INPUT_SCRAMBLE_22 EQ ' ' THEN ' ' ELSE SUBSTRING(LIST_SCRAMBLE, SHIFT_SCRAMBLE_22, 1); OUTPUT_NAME /A50 = OVRLAY(OUTPUT_NAME, 50, OUTPUT_SCRAMBLE_22, 1, 22, 'A50'); INPUT_SCRAMBLE_23 /A01 = SUBSTRING(STRING, 23, 1); SHIFT_SCRAMBLE_23 /I03 = ((62 * PRDUNI(23, 'D12.10')) + 0 ) ; OUTPUT_SCRAMBLE_23 /A01 = IF INPUT_SCRAMBLE_23 EQ ' ' THEN ' ' ELSE SUBSTRING(LIST_SCRAMBLE, SHIFT_SCRAMBLE_23, 1); OUTPUT_NAME /A50 = OVRLAY(OUTPUT_NAME, 50, OUTPUT_SCRAMBLE_23, 1, 23, 'A50'); INPUT_SCRAMBLE_24 /A01 = SUBSTRING(STRING, 24, 1); SHIFT_SCRAMBLE_24 /I03 = ((62 * PRDUNI(24, 'D12.10')) + 0 ) ; OUTPUT_SCRAMBLE_24 /A01 = IF INPUT_SCRAMBLE_24 EQ ' ' THEN ' ' ELSE SUBSTRING(LIST_SCRAMBLE, SHIFT_SCRAMBLE_24, 1); OUTPUT_NAME /A50 = OVRLAY(OUTPUT_NAME, 50, OUTPUT_SCRAMBLE_24, 1, 24, 'A50'); INPUT_SCRAMBLE_25 /A01 = SUBSTRING(STRING, 25, 1); SHIFT_SCRAMBLE_25 /I03 = ((62 * PRDUNI(25, 'D12.10')) + 0 ) ; OUTPUT_SCRAMBLE_25 /A01 = IF INPUT_SCRAMBLE_25 EQ ' ' THEN ' ' ELSE SUBSTRING(LIST_SCRAMBLE, SHIFT_SCRAMBLE_25, 1); OUTPUT_NAME /A50 = OVRLAY(OUTPUT_NAME, 50, OUTPUT_SCRAMBLE_25, 1, 25, 'A50'); INPUT_SCRAMBLE_26 /A01 = SUBSTRING(STRING, 26, 1); SHIFT_SCRAMBLE_26 /I03 = ((62 * PRDUNI(26, 'D12.10')) + 0 ) ; OUTPUT_SCRAMBLE_26 /A01 = IF INPUT_SCRAMBLE_26 EQ ' ' THEN ' ' ELSE SUBSTRING(LIST_SCRAMBLE, SHIFT_SCRAMBLE_26, 1); OUTPUT_NAME /A50 = OVRLAY(OUTPUT_NAME, 50, OUTPUT_SCRAMBLE_26, 1, 26, 'A50'); INPUT_SCRAMBLE_27 /A01 = SUBSTRING(STRING, 27, 1); SHIFT_SCRAMBLE_27 /I03 = ((62 * PRDUNI(27, 'D12.10')) + 0 ) ; OUTPUT_SCRAMBLE_27 /A01 = IF INPUT_SCRAMBLE_27 EQ ' ' THEN ' ' ELSE SUBSTRING(LIST_SCRAMBLE, SHIFT_SCRAMBLE_27, 1); OUTPUT_NAME /A50 = OVRLAY(OUTPUT_NAME, 50, OUTPUT_SCRAMBLE_27, 1, 27, 'A50'); INPUT_SCRAMBLE_28 /A01 = SUBSTRING(STRING, 28, 1); SHIFT_SCRAMBLE_28 /I03 = ((62 * PRDUNI(28, 'D12.10')) + 0 ) ; OUTPUT_SCRAMBLE_28 /A01 = IF INPUT_SCRAMBLE_28 EQ ' ' THEN ' ' ELSE SUBSTRING(LIST_SCRAMBLE, SHIFT_SCRAMBLE_28, 1); OUTPUT_NAME /A50 = OVRLAY(OUTPUT_NAME, 50, OUTPUT_SCRAMBLE_28, 1, 28, 'A50'); INPUT_SCRAMBLE_29 /A01 = SUBSTRING(STRING, 29, 1); SHIFT_SCRAMBLE_29 /I03 = ((62 * PRDUNI(29, 'D12.10')) + 0 ) ; OUTPUT_SCRAMBLE_29 /A01 = IF INPUT_SCRAMBLE_29 EQ ' ' THEN ' ' ELSE SUBSTRING(LIST_SCRAMBLE, SHIFT_SCRAMBLE_29, 1); OUTPUT_NAME /A50 = OVRLAY(OUTPUT_NAME, 50, OUTPUT_SCRAMBLE_29, 1, 29, 'A50'); INPUT_SCRAMBLE_30 /A01 = SUBSTRING(STRING, 30, 1); SHIFT_SCRAMBLE_30 /I03 = ((62 * PRDUNI(30, 'D12.10')) + 0 ) ; OUTPUT_SCRAMBLE_30 /A01 = IF INPUT_SCRAMBLE_30 EQ ' ' THEN ' ' ELSE SUBSTRING(LIST_SCRAMBLE, SHIFT_SCRAMBLE_30, 1); OUTPUT_NAME /A50 = OVRLAY(OUTPUT_NAME, 50, OUTPUT_SCRAMBLE_30, 1, 30, 'A50'); INPUT_SCRAMBLE_31 /A01 = SUBSTRING
Ian Dalton Posted April 5, 2022 Author Posted April 5, 2022 Thanks so much David for your time on this one - I will try out and get back to you. Looks like DM is a no go in a DEFINE FUNCTION.
Ian Dalton Posted April 6, 2022 Author Posted April 6, 2022 Hi David, your suggestion worked well - just a shame the REPEAT DM command doesnt work in a DEFINE FUNCTION.
Jean-Claude Carriere 3 Posted April 10, 2022 Posted April 10, 2022 While using random functions to anonymize data might be required it is sometime not necessary to go that far and using a function that is not a bijection and cannot be inverted is enough Here is one I use sometime that would convert SPEEDY GONZALEZ to SSSSSS SSSSSSSS which has the advantage of keeping sort order on the first letter. The function assume you only only have upper case caracters (otherwise you may want to replace a and 9 for numeric) DEFINE FUNCTION MASKNAME( LIBELLE/A120 ) PATLIB/A120=PATTERNS(LIBELLE); FCH/A1=SUBSTRING(LIBELLE,1,1); MASKNAME/A120=REPLACE(PATLIB, A, FCH ); END DEFINE FILE MYLASTER MSK_LABEL/A120=MASKNAME( LABEL ); END This is something that canbe translated to SQL and gives REPLACE( REGEXP_REPLACE( REGEXP_REPLACE( REGEXP_REPLACE( REGEXP_REPLACE( T1.LABEL, [[:digit:]], 9 ), [[:upper:]], A ), [[:lower:]], a ), [[:cntrl:]], X ), A, SUBSTRING(T1.LABEL, CAST(1 AS INTEGER), 1) ) AS MSK_LABEL,
Ian Dalton Posted April 11, 2022 Author Posted April 11, 2022 Looks like a good solution too Jean-Claude. The Lloyds IT group directive is to now anonymise data on loading into our DEV environment (when refreshing from PROD) rather than on the fly every time the database is accessed for reports.
Recommended Posts
Create an account or sign in to comment
You need to be a member in order to leave a comment
Create an account
Sign up for a new account in our community. It's easy!
Register a new accountSign in
Already have an account? Sign in here.
Sign In Now