How to model parts equivalence in a database

I am developing an auto parts quoting and pricing application for a company that sells auto parts. The client wants the application to handle their internal daily operations, such as calculating parts prices, creating, and managing quotes. To support these requirements, I have designed a database with more than 20 tables.

One of the key challenges the client wants the application to address is a common issue in the auto spare parts industry: parts equivalency. Equivalent parts are parts that can be used interchangeably. For example, if a part with part number (a unique identifier for auto parts across the industry) PN0001 is equivalent to parts with part numbers PN0002, PN0003, and PN0004, then all these parts can replace one another.

I have attached the screenshot of some tables in my database, parts, vehicles, manufacturers, parts_equivalence,vehicle_parts_compatibility. To capture the equivalence information, I created a self-joint table of the parts table, with part_id and equivalent_part_id columns as shown below.

If I implement this way, I will end up with a situation which looks like this:

part_id equivalent_part_id
PN0001 PN0002
PN0002 PN0003
PN0003 PN0004
PN0004 PN0005
PN0005 PN0006
PN0006 PN0007
PN0007 PN0008
PN0008 PN0009
PN0009 PN0010

PN0001 is equivalent to every parts in this table. But, if at some point I wanted to see if PN0001 is equivalent to PN0010, I have to loop through every records in this table.

If PN0001 = PN0002, and PN0002 = PN0003, so PN0001 = PN0003.

I don't think this is the accurate way to do this as we are going to miss a lot of things in this situation.

I am unsure how to model this concept effectively in a database. Any advice on structuring this would be greatly appreciated, especially if you've encountered a similar problem before.

I'll offer some options, but I'm not claiming they're the only possibilities...

Option A - Flat

Rather than directly linking parts-to-parts, put them into families of equivalent parts, i.e.

part_id family_code
PN0001 fam1
PN0001 fam3
PN0002 fam1
PN0003 fam1
PN0004 fam1
PN0005 fam3

(possible simplification: if you know that parts are only ever a part of a single family -- i.e. you never have a part B that can replace either A or C, but C can't replace A and A can't replace C -- then you could get rid of parts_equivalence and just put the family_code field directly on the parts table.)

To find all parts that can replace PN0001, join twice -- once to find what parts-family (or families, in this case: fam1, fam3) PN0001 is part of, then again to see what other parts are members of those parts-families:

select e1.part_id, e2.part_id
from parts_equivalence e1
inner join parts_equivalent e2 on e2.family_code = e1.family_code
where e1.part_id = 'PN0001'

Giving these parts-families an identity isn't necessarily easy though; you could try:

  • use the "original" part number, if it's always known (but if it were, you'd probably already be using it in your example, instead of chaining off the immediately-prior part_id)
  • use the known replacement part_id (your part_id) to look up what families it's already known to be part of, and copy those down to the newly-arriving equivalent parts (bring them into the family)
    • you could do this as an insert trigger: keep your existing table but extend it with a family column, and at insert, do a lookup for an existing known family for the previous part# and mark the new equivalent part as being part of that same family ... but this only works well if inserting small batches (or 1 row) at a time, and in chronological order -- otherwise you'll end up with "split brain" with different parts that should be in the same family instead forming separate disconnected families, and that's a mess to clean up afterwards.

Option B - Graph (or Tree if the data happens to be very tidy)

You can query directly off the data as you presented it originally, using recursive CTE's (common table expressions) to traverse the graph of possible replacements:

with recursive 
bothways as (
  -- if you can replace A with B, you can probably replace B with A too?
  -- this might not be true! if not, either
  -- * use parts_equivalence as-is in the queries below (and remove this chunk entirely), with the expectation that you'll insert two rows into it when A can replace B and B can replace A
  -- * add a boolean field to parts_equivalence to indicate which rows are bi-directional, and only conditionally double them up here (might keep your source data a little cleaner)
  select part_id, equivalent_part_id from parts_equivalence
  union all
  select equivalent_part_id, part_id from parts_equivalence
  -- where parts_equivalence.bidirectional = true
), 
findemall as (
   -- first find everything we might want to replace
   select 
      part_id as broken_part, part_id as possible_replacement,
      -- we can VERY easily create cycles in our graph (especially with bothways)
      -- leave ourselves breadcrumbs and avoid doubling-back on ourselves
      array[part_id]::text[] as alreadyseen
   from parts
   union all
   -- then start finding replacements, and replacements-for-replacements...
   select findemall.broken_part, e.equivalent_part_id,
      alreadyseen || array[e.equivalent_part_id]::text[]
   from findemall 
   inner join bothways e on e.part_id = findemall.possible_replacement
   -- don't keep following trails to parts we've already seen along this path
   where not (findemall.alreadyseen && array[e.equivalent_part_id]::text[])
),
-- you could also generate family data on the fly (this chunk is optional)
intofamily as (
   select broken_part, min(possible_replacement) as family_root
   from findemall group by broken_part
)
-- we can now ask about a single part, or many parts, as a starting point
select broken_part, possible_replacement
from findemall
where broken_part = 'PN0001'
/* equivalent to option A, but using parts-family data generated from the graph on the fly
select f1.broken_part, f2.broken_part
from intofamily f1
inner join intofamily f2 on f2.family_root = f1.family_root
*/
1 Like