Pastebin gatherings
Search Pastebin for interesting content, using Yara rules to match keywords and expressions. Then display your findings in a Splunk view to make them easier to work with.
This is still a work in progress, but what can be done so far is pretty straightforward.
Prerequisites
Get yourself a Pastebin PRO account to be able to scrape Pastebin, and optionally a GitHub account to retrieve gists as well.
Find your Pastebin API here:
And whitelist your office IP at Pastebin. -> here
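To see whether the whitelisting has taken effect, a small Python check along these lines might help. It is only a sketch: it assumes the scraping endpoint (the same URL used for "api_scrape" in settings.json) answers with a JSON list once your IP is whitelisted and with something that is not JSON otherwise. The function names are my own and not part of PasteHunter.

```python
import json
import urllib.request
from typing import List, Optional

# Scraping endpoint as used for "api_scrape" in settings.json.
SCRAPE_URL = "https://pastebin.com/api_scraping.php"


def parse_scrape_response(body: str) -> Optional[List[dict]]:
    """Return the list of paste entries, or None if the body is not a JSON list."""
    try:
        data = json.loads(body)
    except ValueError:
        return None
    return data if isinstance(data, list) else None


def scraping_allowed(url: str = SCRAPE_URL, timeout: float = 10.0) -> bool:
    """Fetch the endpoint once and report whether it returned a JSON list.

    Assumption: a non-whitelisted IP gets a response that is not a JSON list.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        body = response.read().decode("utf-8", errors="replace")
    return parse_scrape_response(body) is not None
```

Calling scraping_allowed() from the whitelisted machine should then return True; network errors are raised by urllib as usual.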
Install PasteHunter (and give respect to TheHermit)
Run "git clone https://github.com/kevthehermit/PasteHunter" to get the actual program that does the work.
Install the missing dependencies with "pip3 install -r requirements.txt".
Adjusting the config
In settings.json, adjust the appropriate fields to your own data:
{
  "inputs": {
    "pastebin": {
      "enabled": true,
      "module": "inputs.pastebin",
      "api_scrape": "https://pastebin.com/api_scraping.php",
      "api_raw": "https://pastebin.com/api_scrape_item.php?i=[your Pastebin API key]",
      "paste_limit": 200,
      "store_all": false
    },
    "dumpz": {
      "enabled": false,
      "module": "inputs.dumpz",
      "api_scrape": "https://dumpz.org/api/recent",
      "api_raw": "https://dumpz.org/api/dump",
      "paste_limit": 200,
      "store_all": false
    },
    "gists": {
      "enabled": true,
      "module": "inputs.gists",
      "api_token": "[your Git oAuth key]",
      "api_limit": 100,
      "store_all": false,
      "user_blacklist": [],
      "file_blacklist": ["grahamcofborg-eval-package-list"]
    }
  },
Ensure you've enabled the JSON output in the config:
"json_output": {
  "enabled": true,
  "module": "outputs.json_output",
  "classname": "JsonOutput",
  "output_path": "logs/json/",
  "store_raw": true,
  "encode_raw": true
},
P.S. As per the latest issue, check that the comma in the [general] section is set after the "run_frequency" line.
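Since a single missing comma is enough to break the config, it can be worth running the complete settings.json through a JSON parser before starting PasteHunter. The following is a minimal sketch using only the Python standard library. The helper names and the two sample fragments are made up for illustration, and the path in the docstring assumes the /opt/pastehunter install directory used in this guide.

```python
import json
from typing import Optional


def find_json_error(text: str) -> Optional[str]:
    """Return None for valid JSON, otherwise a short message with the position."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return f"line {err.lineno}, column {err.colno}: {err.msg}"
    return None


def check_settings(path: str) -> Optional[str]:
    """Run the same check on a file, e.g. /opt/pastehunter/settings.json."""
    with open(path, encoding="utf-8") as handle:
        return find_json_error(handle.read())


# Made-up fragments: the second one lacks the comma after "run_frequency".
good = '{"general": {"run_frequency": 300, "x": 1}}'
bad = '{"general": {"run_frequency": 300 "x": 1}}'
print(find_json_error(good))
print(find_json_error(bad))
```

For the broken fragment, the message points at the line and column where the parser gave up, which is usually right after the missing comma.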
Every paste gathered is checked against the Yara rules stored in the corresponding Yara directory. Check them out and write your own rules.
Depending on the installation directory, give it a try:
/usr/bin/python3 /opt/pastehunter/pastehunter.py
The output should look something like:
INFO:pastehunter.py:Reading Configs
INFO:pastehunter.py:Configure Inputs
INFO:pastehunter.py:Enabled Input: gists
INFO:pastehunter.py:Enabled Input: pastebin
INFO:pastehunter.py:Configure Outputs
INFO:pastehunter.py:Enabled Output: elastic_output
INFO:pastehunter.py:Enabled Output: syslog_output
INFO:pastehunter.py:Enabled Output: csv_output
INFO:pastehunter.py:Enabled Output: json_output
INFO:pastehunter.py:Compile Yara Rules
INFO:pastehunter.py:Enable Blacklist Rules
INFO:pastehunter.py:Populating Queue
INFO:pastehunter.py:Fetching paste list from inputs.gists
INFO:gists.py:Remaining Limit: 4992. Resets at 2018-02-03T11:38:29
INFO:pastehunter.py:Fetching paste list from inputs.pastebin
INFO:pastehunter.py:Sleeping for 300 Seconds
Check whether the corresponding JSON files have some entries (depending on your install directory) at: /opt/pastehunter/logs/json/
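If you want more than a quick look at the directory, a short script can summarise what has been written so far. This is a sketch under assumptions: it expects every file in the output directory to hold one JSON object, with the matched rule names in a "YaraRule" list (the field name is taken from the Splunk searches further down). The function name is my own.

```python
import json
from collections import Counter
from pathlib import Path


def count_rule_hits(json_dir: str) -> Counter:
    """Count how often each Yara rule name appears across the JSON output files."""
    hits: Counter = Counter()
    for path in sorted(Path(json_dir).iterdir()):
        if not path.is_file():
            continue
        try:
            paste = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            continue  # skip unreadable or half-written files
        # Assumption: matched rules are stored under the "YaraRule" key.
        rules = paste.get("YaraRule") if isinstance(paste, dict) else None
        if isinstance(rules, str):
            rules = [rules]
        if isinstance(rules, list):
            hits.update(str(rule) for rule in rules)
    return hits
```

For example, count_rule_hits("/opt/pastehunter/logs/json/").most_common(10) lists the ten rules that fired most often.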
Configure Splunk
Install a Splunk forwarder on your machine and get the data into your Splunk instance.
The inputs.conf that works on my end is the one below.
[monitor:///opt/pastehunter/logs/json/*]
disabled = false
renderXml = true
index = main
sourcetype = Pastebin
Create the right sourcetype on your Splunk server.
/opt/splunk/etc/system/local/props.conf
[Pastebin]
DATETIME_CONFIG =
INDEXED_EXTRACTIONS = json
KV_MODE = true
NO_BINARY_CHECK = true
category = Structured
description = JavaScript Object Notation format. For more information, visit http://json.org/
disabled = false
pulldown_type = true
SHOULD_LINEMERGE = true
LINE_BREAKER=([\r\n]+)\{
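To get a feeling for what the LINE_BREAKER pattern does, here is a small Python imitation. It reflects my understanding of the setting: an event ends where the pattern matches, the text in the first capture group (the newlines) is thrown away, and the "{" stays as the start of the next event. The sketch moves the brace into a lookahead to get the same effect with re.split. The sample data is made up, and this is not Splunk's actual implementation.

```python
import re
from typing import List

# Same idea as LINE_BREAKER=([\r\n]+)\{ : break on newlines that are directly
# followed by "{", drop the newlines, keep the brace for the next event.
EVENT_BOUNDARY = re.compile(r"[\r\n]+(?=\{)")


def split_events(stream: str) -> List[str]:
    """Split a stream of concatenated JSON objects into one string per event."""
    return [chunk.strip() for chunk in EVENT_BOUNDARY.split(stream) if chunk.strip()]


# Made-up sample: three objects, the first one spread over two lines.
sample = '{"title": "a",\n "syntax": "text"}\n{"title": "b"}\r\n{"title": "c"}\n'
for event in split_events(sample):
    print(repr(event))
```

Note that a newline followed by an indented line does not trigger a break, so a pretty-printed object stays in one piece as long as no nested "{" sits at the very start of a line.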
Job done ...
Now start everything and watch the sourcetype=Pastebin in Splunk.
I've created a view with some searches that could serve as an example for you:
sourcetype=pastebin | stats distinct_count(raw_paste) as pastes
sourcetype=pastebin "YaraRule{}"=* | rename "YaraRule{}" as Topics | stats count by Topics
sourcetype=pastebin "YaraRule{}"=* raw_paste=* | fillnull value="n.a." | rename "YaraRule{}" AS topic | eval result=raw_paste."--> ".scrape_url." <--"| stats count by topic title type syntax result
I already found an interesting paste containing quite personal data from a leak.
Check for yourself for cool stuff, and whether anything is being written about you, your family, or your company. I do this simply by adding a regex part to the search. (| regex _raw="mpauli.de|xxx|xxx|xxx|xxx|xxx" |)
email_list | Saudi Arab govt ambassadors data leaked By Touseef Jaskani | n.a. | text | Saudi Goverment Offical Ambassadors Personal Database Leaked by Touseef Jaskani Officals leaks Get Official leak on Www.Dleets.com name passport bir_date sex s_cell email sup_email [personal records redacted] --> https://pastebin.com/api_scrape_item.php?i=jSVmH4v0 <-- |
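The same kind of watch list can also be applied outside of Splunk, for example to the raw paste text in the JSON output. Below is a minimal sketch; the function names are my own, and "example.org" is just a placeholder for your own terms. Unlike the search above, it escapes every term, so the dot in "mpauli.de" only matches a literal dot, and it ignores case, which is my own choice.

```python
import re
from typing import Iterable, List


def build_watch_pattern(terms: Iterable[str]) -> re.Pattern:
    """Combine literal watch terms into one case-insensitive alternation."""
    escaped = [re.escape(term) for term in terms if term]
    if not escaped:
        raise ValueError("need at least one watch term")
    return re.compile("|".join(escaped), re.IGNORECASE)


def filter_pastes(pastes: Iterable[str], pattern: re.Pattern) -> List[str]:
    """Keep only the paste texts that mention at least one watch term."""
    return [text for text in pastes if pattern.search(text)]


# Placeholder terms; replace them with your own domains and names.
pattern = build_watch_pattern(["mpauli.de", "example.org"])
print(filter_pastes(["mail admin@MPAULI.DE now", "mpaulixde", "nothing"], pattern))
```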
Supervisor config
To run the whole thing under your supervisord, I created the file below:
/etc/supervisor/conf.d/pastehunt.conf
[program:pastehunter]
command=/usr/bin/python3 /opt/pastehunter/pastehunter.py
directory=/opt/pastehunter/
process_name = pastehunter%(process_num)d
startsecs = 20
autostart = true
autorestart = true
user = root
stderr_logfile=/var/log/pastehunter_error.log
stdout_logfile=/var/log/pastehunter.log