Part Data Integration and Structural Best Practices
Sending part data, receiving part data, dealing with millions of parts and hundreds of millions of attributes, importing STEP files & spec sheets, fetching price & availability, and getting all of that data organized either for export or for import to internal teams (purchasing, marketing, etc.), is a daunting task. It’s one that usually falls to a few brave souls within an organization, and those people usually bear the weight of brute-force processing of that data to make it available on demand.
However, that data is critical to business automation, and sales automation in particular. It is the cornerstone of all future automation, API enablement, Blockchain support, supply chain integration, etc. No matter what you call it or what technology you use, this part data organization effort is the absolute first step to gaining efficiency by means of automation down the road.
Here at Orbweaver, we’ve learned some things that broadly apply to everyone, and using that knowledge to help everyone get started has a broad benefit to the supply chain. To help get you started on your integration & automation journey, here are our best practices for managing large-scale imports, exports, and organization of Part Data.
Types of Part Data
When it comes to processing part data, either inbound or outbound, it’s usually best to think about the data in two key categories, and to build integrations and storage to accommodate both.
First is data that is updated frequently. Data that is updated frequently typically needs to be sent or received daily (or optimally, in real time), and usually consists of Pricing, QOH, ATO, CTO, Lead Times, etc. This data is typically communicated as frequently as possible, or optimally, on-demand. Some companies support this type of data lookup in real time via API (and the industry is certainly trending in this direction), but many companies transmit this data via a once-daily update via some other mechanism (more on that later).
When it comes to processing this data, best practice is to store the more frequently updated data separately, in a ‘header’ type of record, which would contain a Part Number as a primary key, and then some basic identifying information (typically name, short description, long description, basic physical characteristics, a spec sheet, and of course, the more fluid pieces of data: any price, availability information, and any compliance or lot data (if available). This data can be updated regularly, as it sits on top of the other part attribute data and acts as a quick-lookup ‘parent’ or ‘header’ record to all of the other detailed part characteristic data. We’ll get into this in more detail later on also.
Second is data that is static (or relatively so). Data that is updated less frequently is typically part specification data. Electrical characteristics, categories, physical attributes, STEP files, images, data sheets, specification sheets, etc. – all of this data is generally static, and usually only needs to be transmitted infrequently. Sometimes it’s transmitted monthly, sometimes on-demand, and sometimes only when changes are made. This data is typically sent via flat file format (EDI 832, CSV, XLS, etc.), and is usually ingested en masse into a processing system, or sometimes even just parked in a shared drive to be accessed by sales and marketing teams manually, and as necessary.
Types of Data Intake
Data is typically transmitted in one of two ways:
First, data can be transmitted periodically, also known as asynchronous integration. As mentioned above, this style of integration has value and efficiency when loading large amounts of relatively static data, but does not effectively support real-time or transactional exchanges of data between companies (like Price, Availability, POs, COs, etc.). EDI has attempted to use this mechanism to automate all aspects of the purchasing cycle (and in fairness, it was born out of a time when APIs were not readily available), but with limited success. EDI remains a key player in this mode of data transport however, along with CSV and XLS files, and direct database exports.
Second, data can be integrated between systems or companies in real-time, also known as synchronous integration. This is usually accomplished by API, but can also be accomplished by CXML or Blockchain. There are a few other approaches to this, but these are the primary mechanisms for real-time integration today, and most integrations fall under the ‘API’ designation, which is a broad, all-encompassing term for structured integrations between systems. Real-time data tends to be data that is highly subject to change, or data that needs to be validated by the receiving or transmitting system in real-time. Examples are Price, Availability, Purchase/Sales Orders, Change Orders, etc. The true value of a real-time integration is that two systems can have a real-time conversation with one another via API, in much the same way two humans would.
For example:
System 1: Hello, do you have Part ABC123?
System 2: Yes, I do.
System 1: How many do you have in stock now, what is your Lead Time, and what is your price for 10,000 of those?
System 2: I have 23,000 on the shelf, my lead time 16 weeks, and at your stated quantity, the cost is .0024 USD per part.
System 1: Very good, I would like to purchase 10,000 of those to be delivered tomorrow.
System 2: I’m sorry, we don’t have overnight shipping since that’s coming from my warehouse in Ontario. I can get it to you by next Monday though, does that work for you?
System 1: Yes, thank you, here is the PO.
System 2: I have received your PO and confirm the details of the order.
System 2: Here is your tracking information, and I’ll send you your invoice shortly.
You can see here the real value of the real-time interaction between systems. With the ability to converse via API, those two systems performed several man-hours of work fetching, organizing, transmitting, and negotiating, and they completed that work in seconds.
Structuring the Data
As referenced above, the key to structuring data is building a header set of data that all other data relates to. This allows pieces of data to be transmitted, updated, and stored independently of one another, and that provides for long-term flexibility and scalability. Per the discussion above, this is a common practice, and your long-term part data storage will need to accommodate that.
Here at Orbweaver, we tend to structure our data in this way:
Header Data (1 parent record w/fluid data + key identifiable Part Number data)
- Images/Media
(0-n child images, videos, or other digital marketing content) - Documents
(0-n child STEP files, specs, drawings, or other technical content) - Categories
(0-n child categories specifically applicable to this part) - Compliance/Change Data
(0-n child records) - Attributes
(0-n child attribute + designated attribute category)
This structure allows each supplier of data to provide data in a way that is most useful for them, but also allows for the flexible intake and storage of that data. Some providers may give you hundreds of attributes, while others may provide a smaller set of data. This high-level structure allows you to capture all data relevant to the header record and flexibly store and retrieve it without confining presuppositions about the structure of incoming data.
Integrating to your Systems
As with the Synchronous/Asynchronous conversation above, the type of integration you would need in place to fetch/store part data will vary somewhat depending on your type of integration.
The most common current practice is to pull a price file daily and a part attribute file monthly or annually (or whenever updated). This is an excellent place to start, and it does have its merits. As discussed above, pulling large volumes of data in asynchronously is often preferable to minimize the sheer volume of data being transmitted between two systems.
As the pace of the industry accelerates, however, current best practice indicates that the usage of an API for the fetching of all part data or at least for Price/Availability (real-time) data is imperative. Consider the application of being able to pull real-time ATO or QOH from a supplier and embedding that in your Buyer’s toolkit so they can validate in real-time their ability to place that PO. The reliability of that transaction is now significantly higher. Likewise, this data can be embedded in RFQ, CO, and all other data inflows and outflows with your supply chain.