Skip to main content

Car or Auto Make-Model-Year Database : For Breakfast

Make Model What?

If you like me were tasked with loading a database of recent car makes/models/years, you would start by looking on the web and seeing if someone else just has it out there, readily available, hopefully for free, but perhaps for a tiny nominal fee.?

If only it was that simple...

I looked and looked, and couldn't find anything that would fit the above requirements. So I thought, who would know about US car models better than Kelly Blue Book? So I went on their site, and sure enough they have a javascript file that lists all known to them makes and models of used cars. Since the file is public, I figured it's not really "evil" if I scrape and parse it for my own benefit. Disagree? Have a better source? Then leave a comment.

Anyway, to cut the long story short, I'm hoping to save a day or so to someone else who may, like me, be looking for this information. The ruby module shown below retrieves and parses the javascript from KBB site into a Ruby data structure of the following form - basically a hash, keyed on make, then on model with list of years as a value:

>> Constants::Auto::DB.keys.sort[0..5]
=> ["AMC", "Acura", "Alfa Romeo", "Audi", "BMW", "Bertone"]
>> Constants::Auto::DB["Subaru"].keys.sort[0..5]
=> ["B9 Tribeca", "Baja", "DL", "Forester", "GL", "GL-10"]
>> Constants::Auto::DB["Audi"]["A4"]
=> ["1999", "2007", "1998", "2006", "2005", "1996", "2004", "2003", "2002", "1997", "2001", "2000"]
>> Constants::Auto::DB["BMW"]["X5"]
=> ["2003", "2002", "2001", "2000", "2005", "2007", "2006", "2004"]

The idea is that you could load the initial hash: @models = KBB::Parser.new.to_hash and then save the output of @models.inspect in your local constants file - hence me using Constants::Auto::DB (I actually have a Rake task for doing this -- let me know if I should post it too). Then you would just re-run this every time you think new car models are added/changed on KBB. Realize, that hitting their site every time you need the data is clearly evil. So use this class to load the data initially, save the result of inspect() call into a ruby file, and use that cached version in your app. Re-run the load every time you want to update your database.

Please let me know if you find this code useful, or if you find a better/cleaner/more comprehensive way of maintaining car make/model/year database.

#
# author: Konstantin Gredeskoul © 2008
# license: public domain
#
require 'net/http'
require 'uri'

module KBB
 MODELS_URL = "http://scripts.kbb.com/kbb/ymmData.axd?VehicleClass=UsedCar"

 class Models
   def initialize(js)   
     @models = {}
     @makes = {}
     n = /ymUsed_\[\d{4}\]\s*=\s*'([^']+)'/
     m = /ymmUsed_\["(\d+)~(\d+)"\]\s*=\s*"([^"]+)"/
     js.split(/\n/).each do |line|
       next if line.strip.blank?
       if matched = n.match(line)
         matched[1].split(/,/).each do |token|
           id, name = token.split('|')
           @makes[id.to_i] = name
         end
       end
      
       if matched = m.match(line)
          year, make_id, models = matched[1], matched[2], matched[3]
          models.split(/,/).each do |t| 
            id, model_name = t.split('|')
            make_name = @makes[make_id.to_i]
            @models[make_name] ||= {}
            @models[make_name][model_name] ||= []
            @models[make_name][model_name] << year
          end
        end
      end
    end
    
    def to_hash
      @models
    end
  end

  class Parser
    def initialize
      @m = Models.new(Net::HTTP.get(URI.parse(MODELS_URL)))
    end
    def to_hash
      @m.to_hash
    end
  end

end

Comments

Anonymous said…
hi

thank you for this - what a great idea

i was wondering what language is that in?

i was given the task to compile this database in one week, but spent a lot of time looking for a list, but no dice...

so if i can make this work, then you saved my day

thanks

kristine
It's written in Ruby. Email me in case you are having troubles running it.
Kristine said…
oops, i realized that after i re-read , lol...

thought i'd ask... do you have a php version of it by chance?

either way, this is super helpful!

thanks again ya

kristine
Kristine said…
hey konstantin,

here is a basic PHP array:

$file = file('./ymmData.axd');



$patternMake = '/ymUsed_\[\d{4}\]\s*=\s*\'([^\']+)\'/';

$patternModel = '/ymmUsed_\["(\d+)~(\d+)"\]\s*=\s*"([^"]+)"/';



foreach($file as $row) {

if(preg_match($patternMake,$row,$matched))

{

$tmpMakes = explode(',',$matched[1]);

foreach($tmpMakes as $str) {

list($id,$name) = explode("|",$str);

$arrMakes[$id] = $name;

}

}

unset($str);

if(preg_match($patternModel,$row,$matched))

{

$year = $matched[1];

$make_id = $matched[2];

$models = $matched[3];

$tmpModels = explode(',',$models);

foreach($tmpModels as $str) {

list($id,$model_name) = explode("|",$str);

$make_name = $arrMakes[$make_id];

$arrModels[$make_name][$model_name][$year] = $year;

}

}

}

ksort($arrModels);

echo
Lester said…
This comment has been removed by the author.
Lester said…
This is exactly what I am looking for, but I am not exactly sure how this is supposed to be setup. Can you please enlighten me?
Lester, it's a ruby script - it's supposed to be run using ruby interpreter. Some knowledge of programming is required to be able to take advantage of this code.
ntv1534 said…
This comment has been removed by the author.
ntv1534 said…
Wow, great job doing this! I'm familiar with Python but not Ruby, though they look pretty similar. Is that built-in regular expression support I see? Savage...

For other people who are making ASP.NET websites or don't want to parse a .axd file, it looks like the Selection Service is directly queryable and addable from here



Seekda has a better description, that allows you to see the results of an HTTP-POST
here

That being said, I still need to figure out how to actually use these properly (Web Service n00b here), but it's good to know they're out there if you don't want to do brute force parsing on the javascript!
matt said…
this is an AMAZING idea, but I'm absolutely lost as to how to get it into a database (I'm running a PHP/SQL setup). any help?
theshirey said…
This is soooo great. I actually need this for a program that I'm writing for a friends Car Audio shop and I'm wondering if you can enlighten me on how I can do this. I'm a newbie at Web Design. I'm actually not going to be using this online at all. It's going to be a stand-alone program. I would greatly appreciate any help.
Dave said…
Can anyone translate that into ASP (Classic)?

What an awesome thing! Thanks!
Anonymous said…
I could perhaps do the file in ASP if I understood the code better...

Could someone "comment" the code to explain what each line is doing?

Thanks!
Anonymous said…
I will pay someone to translate this code into ASP, or VB6.

Contact: jojobinkus@hotmail.com

Thanks

PS - the main URL for this file has changed to: http://scripts.kbb.com/kbb/ymmData.axd?VehicleClass=UsedCar
Anonymous said…
Thanks so much! A real time saver.
Anonymous said…
Hi,

Anyone know where kbb keeps its pricing data?
Nate said…
This is awesome. Thanks so much for your work on this. I took what you gave as a starting point and created a rake task (rake cars:import), that will create the makes, then a models table (referencing make_id). Using Rails and 'find_or_create_by_*' made it a breeze, and I only have to run it every now and then to keep in sync.

Awesome work.
Murray said…
The data has moved to a new server/url. It can now be found at

http://scripts.kbb.com/kbb/ymmData.axd?VehicleClass=UsedCar
Jason Darrow said…
I'm fairly new to Ruby but found this very helpful and don't want to be a total leech.

I loaded this into a MySQL DB:

require "rubygems"
require "activerecord"

ActiveRecord::Base.establish_connection(
:adapter => "mysql",
:host => "localhost",
:username => "username",
:password => "password",
:socket => "/var/run/mysqld/mysqld.sock",
:database => "vehicleDev")

class Make < ActiveRecord::Base
end

class Model < ActiveRecord::Base
end

class Year < ActiveRecord::Base
end

class PopulateCarsDB

def initialize
@cars = Parser.new
end

#To load data into car Makes table
def loadMakes

@cars.to_hash.keys.each do |makestmp|
Make.create(:make => makestmp)
end
end

def loadModels

all_makes = Make.find(:all)

all_makes.each do |make|
@cars.to_hash[make.make].keys.each do |modelKey|
model = Model.create(:make_id => make.id, :model => modelKey)
load_model_years(model.id, @cars.to_hash[make.make][modelKey])
end#end for models keys loop
end#end for makes loop

end

def load_model_years(model_id, model_yrs_array)
#puts model_id.to_s + " | " + model_yrs_array.inspect
model_yrs_array.each do |year|
Year.create(:model_id => model_id, :year => year)
end#end for model's year loop

end

end

cars = PopulateCarsDB.new
cars.loadMakes
cars.loadModels
CFdude said…
This looks like it could work. Do you have it in ColdFusion or cfscript?
Guto said…
There was a small error on the php code, here is the revised one:

$file = file_get_contents('http://scripts.kbb.com/kbb/ymmData.axd?VehicleClass=UsedCar');
$file = explode("\n",$file);

$patternMake = '/ymUsed_\[\d{4}\]\s*=\s*\'([^\']+)\'/';
$patternModel = '/ymmUsed_\["(\d+)~(\d+)"\]\s*=\s*"([^"]+)"/';

foreach($file as $row) {
if(preg_match($patternMake,$row,$matched)){
$tmpMakes = explode(',',$matched[1]);
foreach($tmpMakes as $str) {
list($id,$name) = explode("|",$str);
$arrMakes[$id] = $name;
}
}
unset($str);

if(preg_match($patternModel,$row,$matched)){
$year = $matched[1];
$make_id = $matched[2];
$models = $matched[3];
$tmpModels = explode(',',$models);
foreach($tmpModels as $str) {
list($id,$model_name) = explode("|",$str);
$make_name = $arrMakes[$make_id];
$arrModels[$make_name][$model_name][$year] = $year;
}
}
}

ksort($arrModels);
print_r($arrModels);

GUTO
MoB$TeR said…
im trying to make this work with mi site, and im getting no were. Can someone plz help me make this one for the new cars and one for the used cars. i'm welling to pay for this project. thanks

contact me at admin(@)karqin(dot)com
nadie said…
Nicely done!
A little modification for the PHP version to sort the results alphabetically:

function getCarsDB()
{
$file = file_get_contents('http://scripts.kbb.com/kbb/ymmData.axd?VehicleClass=UsedCar');
$file = explode("\n",$file);

$patternMake = '/ymUsed_\[\d{4}\]\s*=\s*\'([^\']+)\'/';
$patternModel = '/ymmUsed_\["(\d+)~(\d+)"\]\s*=\s*"([^"]+)"/';

foreach($file as $row) {
if(preg_match($patternMake,$row,$matched)){
$tmpMakes = explode(',',$matched[1]);
foreach($tmpMakes as $str) {
list($id,$name) = explode("|",$str);
$arrMakes[$id] = $name;
}
}
unset($str);

if(preg_match($patternModel,$row,$matched)){
$year = $matched[1];
$make_id = $matched[2];
$models = $matched[3];
$tmpModels = explode(',',$models);
foreach($tmpModels as $str) {
list($id,$model_name) = explode("|",$str);
$make_name = $arrMakes[$make_id];
$arrModels[$make_name][$model_name][$year] = $year;
}
}
}

ksort($arrModels);
foreach ($arrModels as &$make)
{
ksort($make);
foreach ($make as &$model)
ksort($model);
}

return $arrModels;
}
Anonymous said…
Here is my modification to use with Drupal and Uber Cart.

I wonder if anyone has the model trim information too? (3 series - 328, 330, 335, etc)


error_reporting(E_ALL);

$cars = getCarsDB();

$mk = 1;
$mdl = 2;
$yr = 3;
$i = 1;
foreach ($cars as $make => $value)
{
if($make == 'BMW' || $make == 'Mercedes-Benz' || $make == 'Audi' || $make == 'Porsche' || $make == 'Volkswagen')
{
//echo "Make: $make
\n";
$mk = $i;
echo "INSERT INTO drupal_term_data (vid,name,description) VALUES ('2','$make', '$make');
";
echo "INSERT INTO drupal_term_hierarchy (tid,parent) VALUES ('" . $i++ . "','0');
";
//$mk++;
}
else
{
continue;
}

foreach ($value as $model => $value1)
{
//echo "-------Model: $model
\n";
$mdl = $i;
echo "INSERT INTO drupal_term_data (vid,name,description) VALUES ('2','$model', '$model');
";
echo "INSERT INTO drupal_term_hierarchy (tid,parent) VALUES ('" . $i++ . "','" . $mk . "');
";
foreach ($value1 as $year)
{
//echo "----------->>>>>Year: $year
\n";

echo "INSERT INTO drupal_term_data (vid,name,description) VALUES ('2','$year', '$year');
";

echo "INSERT INTO drupal_term_hierarchy (tid,parent) VALUES ('" . $i++ . "','" . $mdl . "');
";
}
$mdl++;
}
$mk++;
}

function getCarsDB()
{
$file = file_get_contents('http://scripts.kbb.com/kbb/ymmData.axd?VehicleClass=UsedCar');

$file = explode("\n",$file);

$patternMake = '/ymUsed_\[\d{4}\]\s*=\s*\'([^\']+)\'/';
$patternModel = '/ymmUsed_\["(\d+)~(\d+)"\]\s*=\s*"([^"]+)"/';

foreach($file as $row)
{
if(preg_match($patternMake,$row,$matched))
{
$tmpMakes = explode(',',$matched[1]);
foreach($tmpMakes as $str)
{
list($id,$name) = explode("|",$str);
$arrMakes[$id] = $name;
}
}

unset($str);

if(preg_match($patternModel,$row,$matched))
{
$year = $matched[1];
$make_id = $matched[2];
$models = $matched[3];
$tmpModels = explode(',',$models);
foreach($tmpModels as $str)
{
list($id,$model_name) = explode("|",$str);
$make_name = $arrMakes[$make_id];
$arrModels[$make_name][$model_name][$year] = $year;
}
}
}

ksort($arrModels);
foreach ($arrModels as &$make)
{
ksort($make);
foreach ($make as &$model)
ksort($model);
}

return $arrModels;
}
Anonymous said…
call me crazy, but that data from kbb only goes up to 2008. shouldn't it list all 2009 models?
Anonymous said…
Hey,
Does anyone know where to find a file for new (2009) car makes and models?

thanks
Anonymous said…
For new car data use this url
http://scripts.kbb.com/kbb/ymmData.axd?VehicleClass=NewCar
Jon said…
Anyone know where I could find Trim data (325i, 330i, etc.) and accessories/options avalable (sun roof, winter package, etc.)?

btw this is great info.

thanks
wazzy said…
Hi Konstantin.. do you happen to know if there is a KBB trim file like there is for the make/model?

Sincerely.
Zachary said…
Total time saver, thank you!
Anonymous said…
To get trim type data you will have to use a vin decoding service. I have used vinpower in the past with success.
Anonymous said…
VinPower is way too expensive for me. thanks for the suggestion though.
shalin said…
I have a complete year wise database of car models in US. I can provide Excel File.
You can contact me on my email id logintomontu@gmail.com . I can also sens you some samples, so you can decide.
Anonymous said…
what's the URL for motorcycle list? can you show the php for motorcycles?
Eddie said…
This looks like what i've been searchng for, can the data from this page be extracted?

http://www.webuyanycar.com/Secure/MakeAndModel.aspx

php would be ideal

Thanks,
eddie@blunt1.com
Customer said…
I could really use a csv file of:
Year/Make/Model/Trim of vehicles from 1960 and up.

Can anyone help me out with this please?

Thanks!
Anonymous said…
Howdy. I'm looking for the total number of make by model by year. I don't need the listings, I just want to find out how many there are? Tens of thousands? Hundreds of thousands? Any help would be appreciated.
can u help us with a php script or with a rip of www.mobile.de :)

thank you very very much
John
William Marek said…
This comment has been removed by the author.
Shane said…
All it is great to have this list, however I am looking for a database or tool that will decode all VIN number from 1981 forward. Does anyone know of such a tool or database. This is to be used in a Law Enforcement application to assist in the identification of vehicles.
Anonymous said…
Hey everyone,

Does anyone have a file that has a table for make, model, and year for cars?

or does anyone know how the original poster accessed that javascript file?

sorry if the question is amateur in nature, am very new to this type of stuff =D
Anonymous said…
Shane, my firm is about ready to come out with a proprietary database of around 30 to 40 million records of car owners by year,make, model and VIN that we will be offeriing for licensing or list rental. The entire database was just recently NCOA and only goes back 24 months. Dan
Suren said…
Does this code link to kbb for the database, so that we can get data as it is updated? or is this a one time deal?

Thanks.

Also, I have an excel database for all Year, Make, Model, if anyone is interested. I am not a programmer, so can trade the database for the code to get the dropdowns on our website.

my e-mail is csargisov@gmail.com
mletendre said…
I am brand new to this whole web programming thing and I am trying to create a webpage with drop downs for make model year and then info for a customer to put in name address phone and then email that info to me by hitting submit. Can this be easily done using this info? any pointers would be great! mletendre@radiantdesigners thats a .com account
Anonymous said…
My address is mletender (at) radiant-designers (dot) com sorry for the typo above.
luis said…
guys, http://scripts.kbb.com/kbb/ymmData.axd?VehicleClass=UsedCar is down ... or it moved, any idea where it went?
Anonymous said…
A free version of the year make and model database is here: yearmakemodeldatabase.com
Anonymous said…
hi how do you add an excel database to your website..im working with yahoo merchant..any help would be appreciated.
thanks
mike
Ray said…
If anyone is in need of a vehicle database with the following:

Years: 1904-2012
Makes
Models
Trims (for those years that had them)

I have this data which was updated yesterday and is ready in MySQL format and CSV format.

I will also include a complete get your started PHP/Ajax script for your site as can be seen here in the demo:

http://www.carlistdb.com/demo/

Please email me at: dev[at]idealws.net if your interested in purchasing this for your site.

Regards,
Ray
Dan said…
I just started a project with this kind of thing in mind. I designed it to be an easy to use javascript API, so that with just a few lines of code you can create dependent year / make / model dropdowns, or browseable lists. Once you drill down to the model level you can also get just about any spec of the car as well.

It's still pretty early on in development, but maybe it can help some people here. If you have any ideas about how it could be more useful, please contact me through the form on my site.

http://www.carqueryapi.com/
Anonymous said…
You can also just get the list with a year of upgrades on http://www.thevehiclelist.com It also comes with the dependent drop downs, PHP/Jquery - as well as flat files. Great customer service too
Anonymous said…
Makes can be found at:
http://www.kbb.com/jsdata/2.1.37.1_40678/_makes?vehicleclass=UsedCar

Models:
http://www.kbb.com/jsdata/2.1.37.1_40678/_modelsyears?vehicleclass=UsedCar

Obviously this is KBB's data and I'm sure they try their hardest to keep it under wraps so no telling how long it will reside there. You probably shouldn't use this data if you're building a competing application with KBB services. Doubt they'd know but you shouldn't feel good about it any.
Anonymous said…
This is missing the relationship between make and model.

Popular posts from this blog

Rails3 and The Inevitable Fragmentation

I remember one of the early talks at the Canada On Rails in 2006 had a slide showing how many books one needs to read to be able to write web applications in Rails, versus Java.Of course Java side had about 10 books: Java SE, Java EE, Hibernate, Struts/Tiles/JSF, WebServices, Ant, Maven, Eclipse, JUnit, etc, etc.The Rails slide proudly showed the now hopelessly outdated "Agile Web Development With Ruby on Rails", 1st edition. Those were the times. Back then, during my work for Blurb.com myself and three other engineers managed to learn ruby, rails and build a fully functional e-commerce site in about 3 months. I was blown away by the productivity gains compared to Java, where months could be spent laying out the project, and creating all necessary infrastructure for builds, deployment, automated testing with or without the database, etc.Fast-forward to 2010. We are on a brink of Rails3 release, and oh boy, has the landscape changed since back then. I would argue that in s…

Why I Like PostgreSQL

Today I gave a short presentation at work about PostgreSQL, and why I much prefer it to MySQL.

PostgreSQL vs MySQL: Eternal Battle
I may be misreading this, but it seems that there is a recent trend within startups to move away from MySQL, probably thanks to folks like Heroku on one side (who use PostgreSQL to the extreme, and help and contribute to it's development), vs folks like Oracle on the other side, tainting the "open source pureness" of MySQL :)

At my work we currently use a mid-sized MySQL 5.1 Percona instance, which is holding up quite well I must admit. Both PostgreSQL and MySQL have definitely converged to cover most features that people want, but my leaning is still towards PostgreSQL. I just agree with it's focus on data integrity, recovery, constraints, extensibility, while some of the early decisions in MySQL's design do not agree with me at all (like truncating long strings, 1/0 instead of booleans, ambiguous group by, etc). I think that data …