Description
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Users are increasingly reaching for LLMs to generate code and solve problems. We should add instructions to our repository that help LLMs write datafusion-python code.
Describe the solution you'd like
According to my very quick research into the topic, an llms.txt file seems to be one emerging standard. I know some repositories have opted for a CLAUDE.md file as well. I think part of this issue will be to investigate what the emerging standards are and what we need to do to ensure the major agents are able to use these instructions.
Additionally, since we will have users coming from different communities, it would probably be helpful to have LLM-oriented instructions for rewriting queries from other dataframe APIs into DataFusion.
We should cover:
- Spark
- Pandas
- Polars
There are probably also recommendations for how we should update our docstrings so that they are easily usable by LLMs.