Product data feeds come in various formats.
The most popular feed formats are:
Each of these can effectively catalog product data in slightly different ways.
This site puts those different formats into perspective. With it you can convert data into a number of different data formats.
This example table displays a list of customers, each with four attributes: id, name, age, and gender.
The following examples will show you what this data looks like in each popular feed format.
CSV (Comma-Separated Values) files, as the name implies, separate values with commas or other delimiters.
CSV format is largely self-explanatory, but something you might want to know about CSV files is: What if the value itself has a comma in it?
A system, if it didn’t know otherwise, would parse that data as another value.
To work around that, you can wrap values in quotation marks, which act as escape characters — characters that invoke an alternative interpretation on the following characters in a character sequence.
Here’s an example where all of the data is wrapped in quotation marks.
While functional, wrapping every value is excessive. Values that don’t contain commas don’t need to be wrapped.
CSV files could also use semicolon (;), pipe (|), or other characters as delimiters to separate values — commas are just the most common. Semicolon delimiter use is common in Europe because their prices use a comma as a value rather than a decimal.
Then there are tab-delimited plain text files ending in TXT.
This format helps solve the problem of the comma being parsed, as every new row in that data file is represented by a new row character.
XML (Extensible Markup Language) files are visually similar to HTML (Hypertext Markup Language).
The formatting to an XML file depends on a schema, or structured data, which is used to more clearly provide information to the recipient system.
For example, Walmart relies on a schema that they use to process data in a certain way. This schema is different from what Amazon, Google, and others use.
This format also introduces the concept of a minified and unminified version.
- Unminified: human-oriented file that includes formatting that makes it easy to read.
- Minified: computer-oriented file without easy-to-read formatting. Minified files take up less space.
Those carrot/alligator characters on each line are used to define elements and those elements contain definitions of data.
XML files give more context without that information actually being values in the data.
This format was very popular in the 90s and is still commonly used, but it’s becoming less popular than JSON.
This format is similar to XML files in that it can be minified or unminified and it also works off a schema.
Here there’s a list of customers and each one is represented by properties and values — a name and key value pair.
JSON and XML allow for more organized, standardized, and more complex data compared to a CSV or TXT file.
Lastly, we have XLS (Microsoft Excel Spreadsheet) files.
XLS files are different because instead of being plain text files, they are binary files. They contain complex data like images, styling format formulas, etc.
This example contains the same data established in the original table but it also contains color and font styling information, as well as a chart.
Because XLS files can contain all of this information, this format is commonly used for business cases but not so much as a product feed format.
While we’re on the topic of feed formats, this would be a good time to introduce the concept of file compression.
Channels often may accept or favor compressed files. When you’re dealing with big files, compression can really help with transmission speed. Some compressed files can be one-tenth the size of their uncompressed form.
You’re going to run into instances where you may need to compress your file(s) or where it may be a helpful option.
Continue to Product Feed Basics: Product Attributes