Return Styles: Pseud0ch, Terminal, Valhalla, NES, Geocities, Blue Moon. Entire thread

efficient schema design [poké]

Name: Anonymous 2010-06-14 17:07

Hello /prog/, I am a Pokétrainer-turned-businessman. I run a Pokébusiness where I license access to Pokémon via PC.

A big part of my business's success is that I license Pokémon individually, but also by package deal. For example, I have a Grass family package, a Fire family package, and so on. In fact, in my system, all my Pokémon are hierarchically organized this way (and in case you're wondering, with dual-type Pokémon, I only go by the first type). So if a customer licenses one Charizard, the system understands that the customer is actually singling out a Pokémon from the Fire family. Pokémon (category) -> Fire (subcategory) -> Charizard (product). It's a multi-key index!

So I need to track my Pokélicenses in my RDBMS, but I'm having trouble designing the schema correctly, as I'm unsure about unnecessary repetition of data. The thing is, my Fire family may increase in the future, as I go on to catch more Pokémon and add them to my company's collection. (I have people out hunting for Magmar at Pokémon Mansion as we speak!) But if I add a Magmar to the Fire family, that doesn't mean that everyone who has purchased the Fire family package deal beforehand suddenly gets a license to choose Magmar. No sir! That means when tracking Pokélicenses, jotting down just the Family is not sufficient when handling package deals (category -> subcategory). I need to write down what was in the Family when they bought it (category -> subcategory -> product)!


+--------------+
| licenses     |
+--------------+
| id           |
| client_id    |
| created      |
+--------------+

+----------------+
| license_items  |
+----------------+
| id             |
| license_id     |
| category_id    |
| subcategory_id |
| product_id     | < NOT NULL!
+----------------+


Like so. So a license might look like this:



id   client_id  created
---  ---------  -------------------
121  1031       2009-11-12 12:07:39

id  license_id  category_id  subcategory_id  product_id 
--  ----------  -----------  --------------  ----------
1   121         1 (Pokémon)  2 (Grass)       1 (Bellsprout)
2   121         1            2               2 (Bulbasaur)
3   121         1            2               3 (Ivysaur)
4   121         1            2               4 (Oddish)


Well here's the problem. As I said, the Family deals are the biggest reason I'm in business. So I imagine that there will be a lot of licenses where the same entries, in the same order, will appear over and over -- until I add to the Family, at which point it'll be mostly the same, except with one more line. The database will grow quickly with a lot of redundant info.

But is that actually a problem? I'm obsessed with cutting down redundancy, but I fear that I'm actually worrying too much. Also, if there is a better way, I'm not sure what it is. I considered another table, but it quickly became an abomination responsible for tracking subtle permutations between Families and various collections of Pokémon. What a nightmare!

So, any ideas? Am I on the right Poképath, or am I being denormalized?!?

Oh, and excuse me if you feel I've been verbose in explaining this issue. I'm a friendly businessman, you see. Cheers!

Name: Anonymous 2010-06-14 20:43

>>6
Yes, well, the economy is pretty unpredictable here in Vermillion City. Supply and demand fluctuates so quickly. Bigger is not always better. I can hardly keep track of it myself!

>>8
I'll have to remember this technique for later. Here it may be overkill, but I think there is another place in this schema design that could make great use of a playback mechanic!

>>7
Half of the problem stems from this abysmal decision.
Actually, I don't really sell Pokemon. This is all just a metaphor for the business I'm actually designing for. How sad.... Anyway, in this business, the hierarchy used to organize products doesn't actually have multiple types, unlike Pokemon. Regardless, your second suggestion sounds like it could work!

I do still need to track individual Pokemon purchases however, because a trainer might be interested in, say,  just a Charizard. So I could end up with something like this:


+--------------+
| licenses     |
+--------------+
| id           |
| client_id    |
| created      | < DATETIME
+--------------+

+----------------+
| license_items  |
+----------------+
| id             |
| license_id     |
| category_id    |
| subcategory_id |
| product_id     | < DEFAULT NULL
+----------------+

+----------------+
| products       |
+----------------+
| category_id    |
| subcategory_id |
| product_id     |
| created        | < DATETIME
| name           |
| base_price     |
+----------------+


...very similar to what you propose, which would allow me to conclude that if the product_id is NULL, then that row refers to a licensing of an entire Family (subcategory) -- from here, I can check all the Pokemon in that Family against the timestamp as you demonstrate.

Luckily, the design of this table introduces what I thought was a flaw, but is a actually a feature: for example, I could have the entire Fire Family in one license, but also have a second row in the same license for a single Charizard -- which would seem redundant at first, like something that the design of the table should not allow. But actually, there is the possibility that a customer would like to order some extra backup Charizards on top of their basic full Fire family order. Narrowly dodged a hyper beam right there!

This table's behavior is unruly like this because -- I think this is why, anyway -- I'm forcing the use of program logic to determine the meaning of things in the table, rather than having it be explicitly determined by table structure. (I am arbitrarily deciding that a NULL product_id means something significant, rather than it simply being missing data.) It seems like it will work in this case, and I think it's what I'm going to do -- but generally speaking, is this bad design?

Newer Posts
Don't change these.
Name: Email:
Entire Thread Thread List