Character Encoding Issue When Uploading CSV Files

Hi, Retool community!
I'm encountering an issue related to character encoding and would appreciate some assistance.
I have file input component where users can upload csv files.
Screenshot 2023-08-16 at 14.37.47

For some CSV files, all non-English letters appear correctly after being uploaded, but for others, they are displayed as "?". Both files are UTF-8.

This is how uploaded data looks like when non-english letters are not shown:
Screenshot 2023-08-16 at 15.04.07

The source of this table is:

{{ transformedData.value.map(item => {
  const id = `${moment().format('YYYY-MM-DDTHH:mm:ss.SSSSSS')}-${Object.values(item).join('-')}`;
  return { ...item, date_added: moment().format('YYYY-MM-DDTHH:mm:ss.SSSSSS') };
}) }}

transformedData code:

const tableData = {{ fileInput1.parsedValue[0] }};
const columnNames = {{columnNames.value}};

//get mapped column names from textinput component
const columnMappings = {{columnMappings}};

//create an object where the keys are old column names and the values are new column names
const columnMappingObject = Object.fromEntries(columnNames.map((col, i) => [col, columnMappings[i].value]));

//maps column names in each row to new name
const keyMapper = (row) => {
  const newRow = {};
  for (const [key, value] of Object.entries(row)) {
    const newKey = columnMappingObject[key];
    if (newKey) {
      newRow[newKey] = value;
    }
  }
  return newRow;
};

const uniqueRowsSet = new Set();
const uniqueRows = tableData.map(keyMapper).filter(row => {
  const rowString = JSON.stringify(row);
  if (uniqueRowsSet.has(rowString)) {
    return false; // Skip this row, it's a duplicate
  }
  uniqueRowsSet.add(rowString);
  return true;
});

return uniqueRows;

I've noticed that when I open the problematic file in VS Code, these non-English letters are marked and a message pops up:

This leads me to believe that there might be something distinct about the file causing this behavior. I'm wondering if there's a way to resolve this issue?
If anyone has insight into what might be causing this discrepancy and how to address it, I would greatly appreciate your assistance. Thank you!

Hello, It seem work for my UTF-8 sheet, could you share you csv file?

or maybe you can try Papa.parse

{{Papa.parse(atob(filepicker1.file.data),{header:true, delimiter: ";"}).data}}

https://www.papaparse.com/docs#config

you can set the encoding of config.

Hi @AnsonHwang, I've sent you a message with csv file that does not work. Thanks for your help with this case.

Yes, I receive it. There seem it has a bug with Retool to deal with UTF-8 file.
but the docs of fileUtils is not explain how to use parameter of options

Here is my solution - using Papa

let result = Papa.parse(atob(fileButton1.value[0]),{header:true, delimiter: ","}).data;

return result;

@Kabirdas hello, can you check this topic? thanks.

Thanks, that really works, but I'm not sure how to implement this in my original JS code?

const tableData = {{ fileInput1.parsedValue[0] }};
const columnNames = {{columnNames.value}};

//get mapped column names from textinput component
const columnMappings = {{columnMappings}};

//create an object where the keys are old column names and the values are new column names
const columnMappingObject = Object.fromEntries(columnNames.map((col, i) => [col, columnMappings[i].value]));

//maps column names in each row to new name
const keyMapper = (row) => {
  const newRow = {};
  for (const [key, value] of Object.entries(row)) {
    const newKey = columnMappingObject[key];
    if (newKey) {
      newRow[newKey] = value;
    }
  }
  return newRow;
};

const uniqueRowsSet = new Set();
const uniqueRows = tableData.map(keyMapper).filter(row => {
  const rowString = JSON.stringify(row);
  if (uniqueRowsSet.has(rowString)) {
    return false; // Skip this row, it's a duplicate
  }
  uniqueRowsSet.add(rowString);
  return true;
});

return uniqueRows;

I'm not a master in Java script so no matter how I try to implement this to my code I get error "fileInput1 is not defined".

Hey folks! Sorry for the late reply here. If you're seeing the � character in the CSV itself, i.e. the character you want has already been encoded as U+FFFD before the file gets to Retool then there may not be much else to do :confused: Where is the CSV coming from originally @mondob?

Hi @Kabirdas, I'm not sure what is the original source of files, they might have been exported from different sources by my client.
When I open the original file I can see it shown normally, the "?" is shown only when uploaded to Retool.
Do I understand right that there is no solution in Retoll right now for such files? The only option would be to resave files manually and then upload them to Retool?

Oh I see, my mistake, I misread the thread. If the "?" isn't present in the CSV and the only issue you're having is how to incorporate Anson's script into your original code then it should be doable.

Does the code you posted work? Or was it already throwing the fileInput1 is not defined error?

If it was working, you might try using something like this to incorporate Anson's solution:

const rawData = {{ fileInput1.value[0] }};
const tableData = Papa.parse(atob(rawData),{header:true, delimiter: ","}).data;

For now, the only option in the fileUtils.parseCSV is the header boolean @AnsonHwang (the docs should be updated to reflect this). As of writing this, it uses Papa Parse under the hood though so you should be able to get very similar functionality using that library directly as you mentioned.