Making searchable SQL-dB of breached passwords
Posted: 2019-02-27 08:43am
So, a few weeks ago, another breach was announced on https://haveibeenpwned.com/.
I checked my email addresses there, and sure thing, my main address was leaked. This, however, does not necessarily mean that my passwords have been breached, and thankfully, you can check it on the site above, and the process is probably fine. Your password is never sent to the site, but the first letters of the sha-1 hash and all hashes with those letters are then downloaded to your computer for a check.
However, if you're a bit paranoid, like me, this just won't do it.
I made a local database to check my passwords offline. They can, of course, be breached but not be in this collection, but if it is in this collection, change it immediately. SHA-1 is now considered unsafe and it might only be a matter of time before it is being brute forced. This collection does not have login info connected to it as a security measure, but other lists do.
Make sure to have at least 250Gb free space on the drive you're using for the database. Database takes around 40Gb and log file will be up to 190Gb, but can be truncated later.
So... I downloaded and installed MS SQL Server Developer (free for non commercial single use), then SSMS to have the coding environment.
Then I downloaded the SHA-1 password base (9.8 Gb compressed, 24Gb uncompressed. (https://downloads.pwnedpasswords.com/pa ... hash-v4.7z) and unzipped it with 7-zip.
The rest was code:
First, to build the database:
This might take a while. On my i7-4790K with 32Gb RAM and Samsung Evo 970 500Gb drive it took around 30 minutes.
Now, you can truncate the log. You really don't need it anymore.
Then the code to check my passwords locally. I do this by creating a table with my passwords, then encrypts them with SHA-1 and checks it against dbo.collection1. This is where you really see the difference between indexed and non indexed tables. This code takes less than a second to run:
The result (I used some bleeding obvious passwords that people use for reference. "qwerty" is used 3 810 555 times according to this password breach.
I guess I didn't have to post this, but the inner nerd in me drives me, and I know there are others here.
I checked my email addresses there, and sure thing, my main address was leaked. This, however, does not necessarily mean that my passwords have been breached, and thankfully, you can check it on the site above, and the process is probably fine. Your password is never sent to the site, but the first letters of the sha-1 hash and all hashes with those letters are then downloaded to your computer for a check.
However, if you're a bit paranoid, like me, this just won't do it.
I made a local database to check my passwords offline. They can, of course, be breached but not be in this collection, but if it is in this collection, change it immediately. SHA-1 is now considered unsafe and it might only be a matter of time before it is being brute forced. This collection does not have login info connected to it as a security measure, but other lists do.
Make sure to have at least 250Gb free space on the drive you're using for the database. Database takes around 40Gb and log file will be up to 190Gb, but can be truncated later.
So... I downloaded and installed MS SQL Server Developer (free for non commercial single use), then SSMS to have the coding environment.
Then I downloaded the SHA-1 password base (9.8 Gb compressed, 24Gb uncompressed. (https://downloads.pwnedpasswords.com/pa ... hash-v4.7z) and unzipped it with 7-zip.
The rest was code:
First, to build the database:
Code: Select all
-- Make database
USE [master]
GO
drop database if exists passord
-- Location of DB. Make sure you have around 250Gb
CREATE DATABASE [passord] ON PRIMARY
( NAME = N'passord_Data', FILENAME = N'C:\tmp\passord_Data.mdf' , SIZE = 20GB , MAXSIZE = 100GB, FILEGROWTH = 16384KB )
LOG ON
( NAME = N'passord_Log', FILENAME = N'C:\tmp\passord_Log.ldf' , SIZE = 1Gb , MAXSIZE = 250GB , FILEGROWTH = 16384KB )
GO
use passord
drop table if exists dbo.collection1
create table dbo.collection1
(
pwd char(40) not null, -- always $40
ant int
) on [PRIMARY]
with (data_compression = page)
--Optimize for reading a lot of data
ALTER DATABASE passord SET RECOVERY SIMPLE;
--Now, bulk insert of total 550 000 000 #.
--7C4A8D09CA3762AF61E59520943DC26494F8941B:23174662
BULK INSERT dbo.collection1 FROM 'c:\tmp\pwned-passwords-sha1-ordered-by-hash-v4.txt'
WITH
(
DATAFILETYPE = 'char',
FIELDTERMINATOR = ':',
ROWTERMINATOR = '\n'
);
--Index. This is _very_ important. By making the index after the table, things should go faster
-- File is already sorted on pwd.
create unique clustered index idx on dbo.collection1(pwd) with (data_compression = page) on [PRIMARY];
--Checking
select top 100 * from passord.dbo.collection1
Now, you can truncate the log. You really don't need it anymore.
Then the code to check my passwords locally. I do this by creating a table with my passwords, then encrypts them with SHA-1 and checks it against dbo.collection1. This is where you really see the difference between indexed and non indexed tables. This code takes less than a second to run:
Code: Select all
use passord
drop table if exists #PwdList
create table #PwdList
(
MyPwd varchar(100),
MyPwdH char(40)
)
insert into #PwdList (MyPwd)
Values
('password'),
('Password'),
('qwerty'),
('abc123'),
('123456'),
('abcdefgh')
-- your passwords here. Make sure you don't have online backup tools that backs up the table or program.
;
update #PwdList
Set MyPwdH = upper( sys.fn_varbintohexsubstring(0, HashBytes('SHA1', MyPwd), 1, 0));
--select * from #PwdList
select top 1000 * from passord.dbo.collection1 C
inner join #PwdList M on C.pwd = M.MyPwdH
order by ant desc
The result (I used some bleeding obvious passwords that people use for reference. "qwerty" is used 3 810 555 times according to this password breach.
Code: Select all
pwd ant MyPwd MyPwdH
7C4A8D09CA3762AF61E59520943DC26494F8941B 23174662 123456 7C4A8D09CA3762AF61E59520943DC26494F8941B
B1B3773A05C0ED0176787A4F1574FF0075F7521E 3810555 qwerty B1B3773A05C0ED0176787A4F1574FF0075F7521E
5BAA61E4C9B93F3F0682250B6CF8331B7EE68FD8 3645804 password 5BAA61E4C9B93F3F0682250B6CF8331B7EE68FD8
6367C48DD193D56EA7B0BAAD25B19455E529F5EE 2834058 abc123 6367C48DD193D56EA7B0BAAD25B19455E529F5EE
8BE3C943B1609FFFBFC51AAD666D0A04ADF83C9D 117316 Password 8BE3C943B1609FFFBFC51AAD666D0A04ADF83C9D
425AF12A0743502B322E93A015BCF868E324D56A 58653 abcdefgh 425AF12A0743502B322E93A015BCF868E324D56A