It seems unusual at first to claim JSON can replace CSV because the two file types look
entirely different. CSV looks relatively plain and unstructured and is often viewed inside a
spreadsheet app. JSON looks highly structured and well organized when viewed inside a web app.
CSV files contain multiple lines formed from values separated by commas. The format is rigid.
CSV expects the nth entry on every line to represent the same thing. Think of it as a table
where each row has an entry for every defined column.
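
To make the "nth entry" idea concrete, here is a minimal Python sketch using the standard csv module. The sample data and column names are made up for illustration, and read_rows is a hypothetical helper, not part of any library.

```python
import csv
import io

# Hypothetical sample data: the header names and values are invented.
CSV_TEXT = "name,language,stars\nalice,Python,12\nbob,Go,7\n"


def read_rows(text):
    """Parse CSV text and return each line as a list of strings."""
    return list(csv.reader(io.StringIO(text)))


rows = read_rows(CSV_TEXT)
header, records = rows[0], rows[1:]

# Meaning comes from position alone: index 1 is "language" on every line.
languages = [record[1] for record in records]
print(languages)
```

Notice that nothing on a data line says what a value means. You only know because of where it sits.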
JSON files contain one or more data structures containing a collection of named properties and
values. In other words, well-structured objects of arbitrary complexity.
JSONL takes the best ideas of both file formats. Imagine a data file that puts one JSON value on
each line. It looks like a file that’s rows and rows of JavaScript objects.
JSONL allows a line to have more complicated properties like
nested arrays and objects. You won't see the equivalent casually done in a CSV file.
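
Here is a rough sketch of what that could look like in Python with the standard json module. The records and field names are invented, and to_jsonl is a hypothetical helper I made up for this example.

```python
import json

# Hypothetical records; the field names are invented for illustration.
records = [
    {"name": "alice", "tags": ["admin", "dev"], "address": {"city": "Oslo"}},
    {"name": "bob", "tags": []},  # this line simply has no "address" property
]


def to_jsonl(items):
    """Serialize each item as compact JSON on its own line."""
    return "".join(json.dumps(item) + "\n" for item in items)


jsonl_text = to_jsonl(records)
print(jsonl_text, end="")
```

Each line stands on its own as valid JSON, so one record can carry a nested object while the next one leaves it out.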
JSONL allows for flexibility, resilience, and extensibility while retaining a convenient text-based format.
It’s easy for humans to read and write by hand in popular programming editors.
You can imagine it’s incredibly useful for tools that need to process streams of records.
You’ll find JSONL to be a simple, line-separated format that’s easily
readable and writable by command line tools and scripting languages.
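
As a small sketch of processing a stream of records, the Python below reads JSONL one line at a time. The iter_jsonl name and the sample data are hypothetical, and skipping lines that fail to parse is my assumption about what "resilience" could mean in practice. A stricter tool might raise an error instead.

```python
import io
import json


def iter_jsonl(stream):
    """Yield one parsed record per non-empty line of a text stream."""
    for line in stream:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        try:
            yield json.loads(line)
        except json.JSONDecodeError:
            # Assumption: a bad line is skipped so the rest still loads.
            continue


# io.StringIO stands in for an open file or a pipe here.
sample = io.StringIO('{"id": 1}\nnot json\n\n{"id": 2, "tags": ["a"]}\n')
ids = [record["id"] for record in iter_jsonl(sample)]
print(ids)
```

Because each record is handled as soon as its line arrives, the whole file never has to fit in memory at once.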
JSONL is new to me. I got lucky learning about it while doing R&D with OpenAI APIs. They use
JSONL-formatted training data when fine-tuning the LLMs behind ChatGPT.
I'll actively look for opportunities to use this file format in future projects. Reach out to me on Twitter and let me know of your success. Let’s do
something awesome!