
Lilac Garden
Blazing fast dataset transforms for LLMs



About | Details |
---|---|
Name | Lilac Garden |
Submitted By | Eino Klein |
Release Date | 1 year ago |
Website | Visit Website |
Category | Open Source Data Science |
Lilac is an open-source tool that enables data and AI practitioners to improve their products by improving their data. Lilac Garden is a hosted service that accelerates common dataset transformations, such as clustering, signal computation, and data edits with LLMs.
I've been beta-testing Garden and let me tell you - it really changes the way you think about and work with data! Being able to compute embeddings and clusters in ~1 minute means that you can try lots of new ways to slice and dice your data, and you find a lot more interesting subsets that way.
1 year ago
Congratulations on your launch! I'm glad I came across your tool; it's going to be very helpful for my current research. Easy upvote for me. I'm eager to take a close look. Since I'm working on highly multilingual data, I see you have language detection signals, which is great. Is there also a possibility to find semantically similar data (across, say, a column)? Or maybe the possibility to use custom models that can aid in such a task?
1 year ago
Pandas was transformational in data processing pre-LLMs, and it seems like in an LLM world, the way we process and refactor training data should be seriously re-examined. @nsthorat what's a typical use case? E.g., how does Lilac help with, say, clustering and editing?
1 year ago