Manipulating uploaded Excel data

Hi, I am new to Retool and want to provide a way for users to upload Excel with data which has to be then stored in the DB. I am quite happy to see that you have parsed value directly available. Here are some challenges I might face:

  1. The data does not always start at row 0 - some rows have to be ignored.
  2. The Excel sheet is typically a dump from ERP which contains 50+ columns and we are interested in only a few columns - how do I customize the parse function to pick only a few columns?
  3. sometimes, I might want to run some validation on the Excel and give error message to user - for example, if they submit a file with a wrong set of columns. How do I handle errors in the process?
  4. sometimes, I want to filter out certain rows and upload the rest to the DB. I see that the workflow is available to do the ETL/data manipulation. But can I pair it with the file upload tool?
  5. Does the file uploader support XLSB files which offer better compression for large file size, as the input files are quite large?
  6. If the user is uploading the same file again, I want a way to let him know that - how do I do that?

So far, we have been doing this in Python/Pandas and we were thinking of using Streamlit. But Retools looks like a great productivity booster for us. So, trying it out - i have done a google search and tried to read what I could find, but could not find enough guidance. So, asking help here - so, any pointers would be appreciated.

Hey @Siraj_Samsudeen!

Retool apps don't support Python outside of Workflows at the moment, though it's something that has been requested before and is on the dev team's radar. You can send your data to a Workflow from your app for further manipulation but that might add too much overhead since you're working with large files. For now, if it's comfortable for you, the best way might be to add additional parsing logic using JavaScript.

There are some built-in libraries like lodash and Papa Parse that might be helpful in working with your data. 1-4 sound doable with vanilla JS or lodash but there are also other third-party libraries like sheetjs that are built specifically for managing xlsx files and might help with parsing xlsb files. I can't say for sure that it'll work but Retool stores unparsed file data as a base64 string which is readable by sheetjs. There's documentation on how to import third-party libraries here!

If JavaScript is a hard no and your files are to big to send in a request to a Workflow you can also try using SQL to grab certain rows/columns from the parsed value with a Query JSON with SQL query. That doesn't give you all the functionality of 1-5 but can be convenient to work with.

As for 6... it's possible to temporarily store data in your app (docs here) so you could conceivably pull the file name from yourFileInput.files[0], store it, and compare it to the most recently uploaded file in an event handler.
Screenshot 2023-09-07 at 6.35.14 PM

Hopefully that gives you some ideas! It looks like @AnsonHwang may also have some useful suggestions here!

Here is docs of prase XLSX

retool uses sheetjs as their excel parse technical solution behind it

You can click here in docs to see what parameter you can use.

The range is what you need to skip a some row

Why not filtered it with SQL?
Yes, you also can do it after prasing it, you can filtered it with map function of array if you use js. or you can pass the data to workflow, in that you can use python to filter it.

You can compare the columns and show differrent as errors?

Of course

You can parse excel in query1, and reference in query2 which is a workflow query, then you can pass data to workflow.

It seem it was support in SheetJs, you can try it.

I think you can store the hash value and name of every file. and compare them to judge if they are same

1 Like