80,443 - Pentesting Web Methodology

Learn AWS hacking from zero to hero with htARTE (HackTricks AWS Red Team Expert)!

Other ways to support HackTricks:

If you are interested in a hacking career and want to hack the unhackable, we are hiring! (Fluent written and spoken Polish required.)

Basic Info

The web service is the most common and extensive service, and many different types of vulnerabilities exist in it.

Default port: 80 (HTTP), 443 (HTTPS)

PORT    STATE SERVICE
80/tcp  open  http
443/tcp open  ssl/https
nc -v domain.com 80 # GET / HTTP/1.0
openssl s_client -connect domain.com:443 # GET / HTTP/1.0
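The two commands above open a raw connection and leave you to type the request by hand. A minimal Python sketch of the same idea, using only the standard library, could look like the following. The function names `build_probe` and `grab_banner` are illustrative assumptions, not part of any tool mentioned here, and a `Host` header is added to the hand-typed request because many servers expect one.

```python
import socket
import ssl


def build_probe(host: str) -> bytes:
    """Build the minimal HTTP/1.0 request typed by hand in the nc/openssl examples."""
    return f"GET / HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii")


def grab_banner(host: str, port: int = 80, use_tls: bool = False, timeout: float = 5.0) -> str:
    """Send the probe and return the response head (status line and headers)."""
    raw = socket.create_connection((host, port), timeout=timeout)
    if use_tls:
        # Certificate checks are disabled on purpose: test targets often use self-signed certs.
        ctx = ssl.create_default_context()
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
        conn = ctx.wrap_socket(raw, server_hostname=host)
    else:
        conn = raw
    with conn:
        conn.sendall(build_probe(host))
        data = b""
        # Read until the blank line that ends the headers (or until the peer closes).
        while b"\r\n\r\n" not in data:
            chunk = conn.recv(4096)
            if not chunk:
                break
            data += chunk
    return data.split(b"\r\n\r\n", 1)[0].decode("latin-1")
```

Calling `grab_banner("domain.com", 443, use_tls=True)` against a host in scope should return the status line and headers, which is usually enough to spot the `Server` header.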

Web API Guidance

Web API Pentesting

Methodology summary

In this methodology we assume that you are going to attack a single domain (or subdomain) and only that. You should therefore apply this methodology to each discovered domain, subdomain or IP with an undetermined web server inside the scope.

Server Version (Vulnerable?)

Identify

Check if there are known vulnerabilities for the server version that is running. The HTTP headers and cookies of the response can be very useful for identifying the technologies and/or versions in use. An Nmap scan can identify the server version, but the tools whatweb, webtech or https://builtwith.com/ can also be useful:

whatweb -a 1 <URL> #Stealthy
whatweb -a 3 <URL> #Aggressive
webtech -u <URL>
webanalyze -host https://google.com -crawl 2
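As a rough illustration of what these tools look at first, the sketch below pulls the headers that most often leak technology or version information out of a response. The header list and the function name `fingerprint` are assumptions chosen for the example; real fingerprinting tools use far larger signature sets.

```python
FINGERPRINT_HEADERS = ("server", "x-powered-by", "x-aspnet-version", "x-generator", "via")


def fingerprint(headers: dict[str, str]) -> dict[str, str]:
    """Return the response headers that usually leak technology or version info."""
    lowered = {name.lower(): value for name, value in headers.items()}
    hints = {name: lowered[name] for name in FINGERPRINT_HEADERS if name in lowered}
    # Session cookie names are a weaker hint, e.g. PHPSESSID (PHP) or JSESSIONID (Java).
    cookie = lowered.get("set-cookie", "")
    if cookie:
        hints["cookie-name"] = cookie.split("=", 1)[0].strip()
    return hints
```

The version strings returned this way are what you would then search for known vulnerabilities.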

Search for vulnerabilities of the web application version

Check if there is any WAF

Web tech tricks

Some tricks for finding vulnerabilities in different well known technologies being used:

Take into account that the same domain can be using different technologies on different ports, folders and subdomains. If the web application is using any well-known tech/platform listed before or any other, don't forget to search the Internet for new tricks (and let me know!).

Source Code Review

If the source code of the application is available on GitHub, apart from performing a white-box test of the application yourself, there is some information that could be useful for the current black-box testing:

  • Is there a Change-log or Readme or Version file or anything with version info accessible via web?

  • How and where are the credentials saved? Is there any (accessible?) file with credentials (usernames or passwords)?

  • Are passwords in plain text, encrypted or which hashing algorithm is used?

  • Is it using any master key for encrypting something? Which algorithm is used?

  • Can you access any of these files exploiting some vulnerability?

  • Is there any interesting information in the GitHub issues (solved and not solved)? Or in the commit history (maybe a password introduced in an old commit)?

Source code Review / SAST Tools

Automatic scanners

General purpose automatic scanners

nikto -h <URL>
whatweb -a 4 <URL>
wapiti -u <URL>
W3af
zaproxy #You can use an API
nuclei -ut && nuclei -target <URL>

# https://github.com/ignis-sec/puff (client side vulns fuzzer)
node puff.js -w ./wordlist-examples/xss.txt -u "http://www.xssgame.com/f/m4KKGHi2rVUN/?query=FUZZ"

CMS scanners

If a CMS is used don't forget to run a scanner, maybe something juicy is found:

  • Clusterd: JBoss, ColdFusion, WebLogic, Tomcat, Railo, Axis2, Glassfish

  • CMSScan: WordPress, Drupal, Joomla, vBulletin websites for security issues (GUI)

  • VulnX: Joomla, WordPress, Drupal, PrestaShop, OpenCart

  • CMSMap: (W)ordPress, (J)oomla, (D)rupal or (M)oodle

  • droopescan: Drupal, Joomla, Moodle, SilverStripe, WordPress

cmsmap [-f W] -F -d <URL>
wpscan --force --update -e --url <URL>
joomscan --ec -u <URL>
joomlavs.rb #https://github.com/rastating/joomlavs

At this point you should already have some information about the web server used by the client (if any data was given) and some tricks to keep in mind during the test. If you are lucky, you have even found a CMS and run a scanner against it.

Step-by-step Web Application Discovery

From this point we are going to start interacting with the web application.

Initial checks

Default pages with interesting info:

  • /robots.txt

  • /sitemap.xml

  • /crossdomain.xml

  • /clientaccesspolicy.xml

  • /.well-known/

  • Check also comments in the main and secondary pages.
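The default pages listed above can be turned into a small checklist of URLs for any base URL. The sketch below also extracts `Disallow` entries from a robots.txt body, since those entries often point at paths the owner did not want indexed. The function names are illustrative assumptions, and fetching the URLs is left to whatever HTTP client you prefer.

```python
from urllib.parse import urljoin

DEFAULT_PATHS = [
    "/robots.txt",
    "/sitemap.xml",
    "/crossdomain.xml",
    "/clientaccesspolicy.xml",
    "/.well-known/",
]


def default_page_urls(base_url: str) -> list[str]:
    """Resolve each default path against the web root of base_url."""
    return [urljoin(base_url, path) for path in DEFAULT_PATHS]


def disallowed_paths(robots_txt: str) -> list[str]:
    """Extract non-empty Disallow entries from a robots.txt body."""
    paths = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths
```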

Forcing errors

Web servers may behave unexpectedly when weird data is sent to them. This may open vulnerabilities or disclose sensitive information.

  • Access fake pages like /whatever_fake.php (.aspx,.html,.etc)

  • Add "[]", "]]", and "[[" in cookie values and parameter values to create errors

  • Generate an error by appending /~randomthing/%s to the end of the URL

  • Try different HTTP verbs such as PATCH or DEBUG, or invalid ones such as FAKE
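The error-forcing ideas above can be enumerated mechanically. The following sketch only builds the list of probes as `(method, url, params)` tuples; sending them and comparing the responses is left to the tester. The function name `error_probes` and the tuple layout are assumptions made for this example.

```python
from urllib.parse import urljoin

ERROR_TOKENS = ["[]", "]]", "[["]
FAKE_EXTENSIONS = [".php", ".aspx", ".html"]
ODD_VERBS = ["PATCH", "DEBUG", "FAKE"]


def error_probes(base_url: str, params: dict[str, str]) -> list[tuple[str, str, dict[str, str]]]:
    """Return (method, url, params) tuples covering the error-forcing ideas listed above."""
    probes = []
    # Non-existent pages with different extensions.
    for ext in FAKE_EXTENSIONS:
        probes.append(("GET", urljoin(base_url, "/whatever_fake" + ext), {}))
    # Path containing a format-string-like token.
    probes.append(("GET", urljoin(base_url, "/~randomthing/%s"), {}))
    # Replace each parameter value, one at a time, with a bracket token.
    for name in params:
        for token in ERROR_TOKENS:
            probes.append(("GET", base_url, {**params, name: token}))
    # Unusual or invalid HTTP verbs with the original parameters.
    for verb in ODD_VERBS:
        probes.append((verb, base_url, dict(params)))
    return probes
```

The same bracket tokens can be placed in cookie values; that case is omitted here to keep the sketch short.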

Check if you can upload files (PUT verb, WebDav)

If you find that WebDav is enabled but you don't have enough permissions for uploading files in the root folder try to:

  • Brute Force credentials

  • Upload files via WebDav to the rest of found folders inside the web page. You may have permissions to upload files in other folders.
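A quick first check for upload capability is to send an `OPTIONS` request to each discovered folder and inspect the `Allow` and `DAV` response headers. The sketch below interprets those headers; the function name `upload_hints` is an assumption for this example. Note that an advertised method is only a hint: the server may still reject the actual `PUT`, and some servers accept `PUT` without advertising it.

```python
def upload_hints(options_headers: dict[str, str]) -> dict[str, object]:
    """Summarise what an OPTIONS response suggests about PUT and WebDAV support."""
    lowered = {name.lower(): value for name, value in options_headers.items()}
    allowed = {m.strip().upper() for m in lowered.get("allow", "").split(",") if m.strip()}
    return {
        "put_allowed": "PUT" in allowed,
        # A DAV header or WebDAV-specific methods both suggest WebDAV is enabled.
        "webdav": "dav" in lowered or bool(allowed & {"PROPFIND", "MKCOL"}),
        "methods": sorted(allowed),
    }
```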

SSL/TLS vulnerabilities

  • If the application isn't forcing the use of HTTPS in any part, then it's vulnerable to MitM

  • If the application is sending sensitive data (passwords) over HTTP, then it's a high-severity vulnerability.

Use testssl.sh to check for vulnerabilities (in Bug Bounty programs these kinds of vulnerabilities probably won't be accepted) and use a2sv to recheck the vulnerabilities:
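Whether HTTPS is enforced can be judged from how the server answers a plain HTTP request and whether the HTTPS response carries an HSTS header. The sketch below classifies a single response; the function name and the four labels it returns are assumptions chosen for this example.

```python
def https_enforcement(status: int, headers: dict[str, str], scheme: str) -> str:
    """Classify one response. `scheme` is the scheme of the request ("http" or "https")."""
    lowered = {name.lower(): value for name, value in headers.items()}
    if scheme == "http":
        location = lowered.get("location", "")
        if 300 <= status < 400 and location.lower().startswith("https://"):
            return "redirects-to-https"
        # Content is served in clear text: anything sensitive here is exposed to MitM.
        return "served-over-http"
    if "strict-transport-security" in lowered:
        return "https-with-hsts"
    return "https-without-hsts"
```

A page that returns `served-over-http` and contains a login form is the case described in the second bullet above.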

./testssl.sh [--htmlfile] 10.10.10.10:443
# Use --htmlfile to also save the output to an HTML file

# You can also use other tools, but testssl.sh is currently the best one (I think)
sslscan <host:port>
sslyze --regular <ip:port>

Information about SSL/TLS vulnerabilities:

Spidering

Launch some kind of spider against the web application. The goal of the spider is to find as many paths as possible in the tested application. Therefore, web crawling and external sources should be used to find as many valid paths as possible.

  • gospider (go): HTML spider, LinkFinder in JS files and external sources (Archive.org, CommonCrawl.org, VirusTotal.com, AlienVault.com).

  • hakrawler (go): HTML spider, with LinkFinder for JS files and Archive.org as external source.

  • dirhunt (python): HTML spider, also indicates "juicy files".

  • evine (go): Interactive CLI HTML spider. It also searches in Archive.org

  • meg (go): This tool isn't a spider but it can be useful. You can just indicate a file with hosts and a file with paths and meg will fetch each path on each host and save the response.

  • urlgrab (go): HTML spider with JS rendering capabilities. However, it looks like it's unmaintained, the precompiled version is old and the current code doesn't compile

  • gau (go): HTML spider that uses external providers (wayback, otx, commoncrawl)

  • ParamSpider: This script will find URLs with parameter and will list them.

  • galer (go): HTML spider with JS rendering capabilities.

  • LinkFinder (python): HTML spider with JS beautify capabilities, able to search for new paths in JS files. It could also be worth taking a look at JSScanner, which is a wrapper of LinkFinder.

  • goLinkFinder (go): To extract endpoints in both HTML source and embedded javascript files. Useful for bug hunters, red teamers, infosec ninjas.

  • JSParser (python2.7): A python 2.7 script using Tornado and JSBeautifier to parse relative URLs from JavaScript files. Useful for easily discovering AJAX requests. It looks unmaintained.

  • relative-url-extractor (ruby): Given a file (HTML), it will extract URLs from it using a nifty regular expression to find and extract the relative URLs from ugly (minified) files.

  • JSFScan (bash, several tools): Gather interesting information from JS files using several tools.

  • subjs (go): Find JS files.

  • page-fetch (go): Load a page in a headless browser and print out all the urls loaded to load the page.

  • Feroxbuster (rust): Content discovery tool mixing several options of the previous tools

  • Javascript Parsing: A Burp extension to find path and params in JS files.

  • Sourcemapper: A tool that, given the .js.map URL, will get you the beautified JS code

  • xnLinkFinder: This is a tool used to discover endpoints for a given target.

  • waymore: Discover links from the Wayback Machine (also downloading the responses in the Wayback Machine and looking for more links)

  • HTTPLoot (go): Crawl (even by filling forms) and also find sensitive info using specific regexes.

  • SpiderSuite: Spider Suite is an advanced multi-feature GUI web security Crawler/Spider designed for cyber security professionals.

  • jsluice (go): It's a Go package and command-line tool for extracting URLs, paths, secrets, and other interesting data from JavaScript source code.

  • ParaForge: ParaForge is a simple Burp Suite extension to extract the parameters and endpoints from requests to create custom wordlists for fuzzing and enumeration.
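The core step shared by the HTML spiders above is extracting links from a page and keeping the ones that stay on the target host. A minimal standard-library sketch of that step follows; the class and function names are assumptions for this example, and it does not parse JavaScript or query external sources the way the listed tools do.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect absolute URLs from href, src and action attributes."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.links: set[str] = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src", "action") and value:
                self.links.add(urljoin(self.base_url, value))


def in_scope_links(base_url: str, html: str) -> list[str]:
    """Return the sorted links found in `html` that share the host of `base_url`."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    host = urlparse(base_url).netloc
    return sorted(link for link in collector.links if urlparse(link).netloc == host)
```

Feeding every newly found in-scope page back into `in_scope_links` until no new URLs appear gives a basic crawler.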

Brute Force directories and files

Start brute-forcing from the root folder and be sure to brute-force all the directories found using this method and all the directories discovered by the Spidering (you can do this brute-forcing recursively, prepending the names of the found directories to the wordlist in use). Tools:

  • Dirb / Dirbuster - Included in Kali, old (and slow) but functional. Allows self-signed certificates and recursive search. Too slow compared with the other options.

  • Dirsearch (python): It doesn't allow self-signed certificates but allows recursive search.

  • Gobuster (go): It allows self-signed certificates, it doesn't have recursive search.

  • Feroxbuster - Fast, supports recursive search.

  • wfuzz: wfuzz -w /usr/share/seclists/Discovery/Web-Content/raft-medium-directories.txt https://domain.com/api/FUZZ

  • ffuf - Fast: ffuf -c -w /usr/share/wordlists/dirb/big.txt -u http://10.10.10.10/FUZZ

  • uro (python): This isn't a spider but a tool that, given the list of found URLs, will delete "duplicated" URLs.

  • Scavenger: Burp Extension to create a list of directories from the burp history of different pages

  • TrashCompactor: Remove URLs with duplicated functionalities (based on js imports)

  • Chameleon: It uses Wappalyzer to detect the technologies in use and select the wordlists to use.

Recommended dictionaries:

Note that anytime a new directory is discovered during brute-forcing or spidering, it should be Brute-Forced.
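The recursive approach described in this section can be sketched as a breadth-first queue: every directory that is found gets brute-forced with the same wordlist, and names learned from spidering are tried first. This is a simplified illustration, not how any of the listed tools is implemented. The `exists` callback and the `spidered_names` parameter are assumptions introduced so the logic can be shown (and tested) without network access.

```python
from collections import deque
from typing import Callable, Iterable


def recursive_bruteforce(
    root: str,
    wordlist: Iterable[str],
    exists: Callable[[str], bool],
    spidered_names: Iterable[str] = (),
    max_depth: int = 3,
) -> list[str]:
    """Breadth-first directory brute force; each hit is queued and brute-forced too.

    `exists` is supplied by the caller, for example an HTTP request that
    treats any non-404 status as a hit.
    """
    # Names seen while spidering go first; dict.fromkeys removes duplicates in order.
    words = list(dict.fromkeys([*spidered_names, *wordlist]))
    start = root.rstrip("/") + "/"
    found: list[str] = []
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        base, depth = queue.popleft()
        for word in words:
            candidate = f"{base}{word}/"
            if candidate in seen or not exists(candidate):
                continue
            seen.add(candidate)
            found.append(candidate)
            if depth + 1 < max_depth:
                queue.append((candidate, depth + 1))
    return found
```

The `max_depth` limit is there because recursive brute-forcing grows quickly with each level.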

What to check on each file found

Special findings

While performing the spidering and brute-forcing you could find interesting things that you should take note of.

Interesting files

403 Forbidden/Basic Authentication/401 Unauthorized (bypass)

403 & 401 Bypasses

502 Proxy Error

If any page responds with that code, it's probably a badly configured proxy. If you send an HTTP request like: GET https://google.com HTTP/1.1 (with the host header and other common headers), the proxy will try to access google.com and you will have found an SSRF.
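Most HTTP clients will not let you put a full URL in the request line, so this test is easiest to do with a raw request. The sketch below only builds the request bytes; you would send them over a socket to the suspected proxy. The function name and the specific "common headers" chosen are assumptions for this example, and the `Host` value is left as a parameter because the text does not say which host it should carry.

```python
def absolute_uri_request(target_url: str, host_header: str) -> bytes:
    """Build a raw request whose request line carries a full URL (absolute-form)."""
    lines = [
        f"GET {target_url} HTTP/1.1",
        f"Host: {host_header}",
        "User-Agent: Mozilla/5.0",
        "Accept: */*",
        "Connection: close",
        "",  # blank line that terminates the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")
```

If the response contains the content of the external site, the proxy is forwarding arbitrary requests.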

NTLM Authentication - Info disclosure

If the server asking for authentication runs Windows, or you find a login asking for your credentials (and for a domain name), you can provoke an information disclosure. Send the header "Authorization: NTLM TlRMTVNTUAABAAAAB4IIAAAAAAAAAAAAAAAAAAAAAAA=" and, due to how NTLM authentication works, the server will respond with internal info (IIS version, Windows version...) inside the "WWW-Authenticate" header. You can automate this using the nmap script "http-ntlm-info.nse".
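The value returned in "WWW-Authenticate" is a base64-encoded NTLM type 2 (challenge) message, and the internal names sit in its TargetInfo block. The sketch below decodes it following the field layout of the NTLM challenge message as I understand it (signature, message type, flags at offset 20, TargetInfo fields at offset 40, optional Version at offset 48). Treat it as an illustration rather than a full parser; the function name and the output keys are assumptions for this example.

```python
import base64
import struct

# Type 1 (negotiate) message from the text above, sent in the Authorization header.
NTLM_PROBE = "TlRMTVNTUAABAAAAB4IIAAAAAAAAAAAAAAAAAAAAAAA="

AV_NAMES = {
    1: "netbios_computer",
    2: "netbios_domain",
    3: "dns_computer",
    4: "dns_domain",
    5: "dns_tree",
}


def parse_ntlm_challenge(www_authenticate: str) -> dict[str, str]:
    """Decode the type 2 message from a 'NTLM <base64>' WWW-Authenticate value."""
    scheme, _, blob = www_authenticate.strip().partition(" ")
    if scheme.upper() != "NTLM" or not blob:
        raise ValueError("not an NTLM challenge")
    msg = base64.b64decode(blob)
    if len(msg) < 48 or msg[:8] != b"NTLMSSP\x00" or struct.unpack_from("<I", msg, 8)[0] != 2:
        raise ValueError("not an NTLM type 2 message")
    info: dict[str, str] = {}
    # The negotiate-version flag says whether the Version field at offset 48 is filled in.
    flags = struct.unpack_from("<I", msg, 20)[0]
    if flags & 0x02000000 and len(msg) >= 56:
        major, minor, build = struct.unpack_from("<BBH", msg, 48)
        info["os_version"] = f"{major}.{minor}.{build}"
    # TargetInfo is a list of AV pairs: 2-byte id, 2-byte length, UTF-16LE value.
    ti_len, _, ti_off = struct.unpack_from("<HHI", msg, 40)
    pos, end = ti_off, ti_off + ti_len
    while pos + 4 <= end:
        av_id, av_len = struct.unpack_from("<HH", msg, pos)
        pos += 4
        if av_id == 0:  # end-of-list marker
            break
        if av_id in AV_NAMES:
            info[AV_NAMES[av_id]] = msg[pos:pos + av_len].decode("utf-16-le")
        pos += av_len
    return info
```

Pass the function the part of the header value that starts with `NTLM `; servers that offer several schemes send one "WWW-Authenticate" header per scheme.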

HTTP Redirect (CTF)

It is possible to put content inside a Redirection. This content won't be shown to the user (as the browser will execute the redirection) but something could be hidden in there.
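To see such content you need the raw response rather than the page the browser ends up on; tools that do not follow redirects by default (for example `curl -i` without `-L`, or Python's `http.client`) show it. The sketch below extracts the body from a raw 3xx response; the function name is an assumption for this example, and it does not handle chunked or compressed bodies.

```python
def redirect_body(raw_response: bytes) -> bytes:
    """Return the body of a 3xx response, or b'' if the response is not a redirect."""
    head, _, body = raw_response.partition(b"\r\n\r\n")
    status_line = head.split(b"\r\n", 1)[0].split()
    if len(status_line) < 2 or not status_line[1].isdigit():
        raise ValueError("malformed status line")
    return body if 300 <= int(status_line[1]) < 400 else b""
```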

Web Vulnerabilities Checking

Now that a comprehensive enumeration of the web application has been performed it's time to check for a lot of possible vulnerabilities. You can find the checklist here:

Web Vulnerabilities Methodology

Find more info about web vulns in:

Monitor Pages for changes

You can use tools such as https://github.com/dgtlmoon/changedetection.io to monitor pages for modifications that might insert vulnerabilities.
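The basic idea behind this kind of monitoring is to store a fingerprint of each page and compare it on the next visit. A minimal sketch of that idea follows; it is not how changedetection.io works internally, and the function names and the whitespace normalisation are assumptions for this example.

```python
import hashlib
import re


def page_fingerprint(html: str) -> str:
    """Hash a page after collapsing whitespace so trivial reformatting is not reported."""
    normalized = re.sub(r"\s+", " ", html).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def changed_pages(previous: dict[str, str], current_pages: dict[str, str]) -> list[str]:
    """Compare stored fingerprints (url -> hash) with freshly fetched bodies (url -> html)."""
    return sorted(
        url
        for url, html in current_pages.items()
        if previous.get(url) != page_fingerprint(html)
    )
```

New pages are reported as changed too, since they have no stored fingerprint. Pages with dynamic content (timestamps, CSRF tokens) would need those parts stripped before hashing.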


HackTricks Automatic Commands

Protocol_Name: Web    #Protocol Abbreviation if there is one.
Port_Number:  80,443     #Comma separated if there is more than one.
Protocol_Description: Web         #Protocol Abbreviation Spelled out

Entry_1:
  Name: Notes
  Description: Notes for Web
  Note: |
    https://book.hacktricks.xyz/pentesting/pentesting-web

Entry_2:
  Name: Quick Web Scan
  Description: Nikto and GoBuster
  Command: nikto -host {Web_Proto}://{IP}:{Web_Port} && gobuster dir -w {Small_Dirlist} -u {Web_Proto}://{IP}:{Web_Port} && gobuster dir -w {Big_Dirlist} -u {Web_Proto}://{IP}:{Web_Port}

Entry_3:
  Name: Nikto
  Description: Basic Site Info via Nikto
  Command: nikto -host {Web_Proto}://{IP}:{Web_Port}

Entry_4:
  Name: WhatWeb
  Description: General purpose auto scanner
  Command: whatweb -a 4 {IP}

Entry_5:
  Name: Directory Brute Force Non-Recursive
  Description:  Non-Recursive Directory Brute Force
  Command: gobuster dir -w {Big_Dirlist} -u {Web_Proto}://{IP}:{Web_Port}

Entry_6:
  Name: Directory Brute Force Recursive
  Description: Recursive Directory Brute Force
  Command: python3 {Tool_Dir}dirsearch/dirsearch.py -w {Small_Dirlist} -e php,exe,sh,py,html,pl -f -t 20 -u {Web_Proto}://{IP}:{Web_Port} -r 10

Entry_7:
  Name: Directory Brute Force CGI
  Description: Common Gateway Interface Brute Force
  Command: gobuster dir -u {Web_Proto}://{IP}:{Web_Port}/ -w /usr/share/seclists/Discovery/Web-Content/CGIs.txt -s 200

Entry_8:
  Name: Nmap Web Vuln Scan
  Description: Tailored Nmap Scan for web Vulnerabilities
  Command: nmap -vv --reason -Pn -sV -p {Web_Port} --script="banner,(http* or ssl*) and not (brute or broadcast or dos or external or http-slowloris* or fuzzer)" {IP}

Entry_9:
  Name: Drupal
  Description: Drupal Enumeration Notes
  Note: |
    git clone https://github.com/immunIT/drupwn.git for low hanging fruit and git clone https://github.com/droope/droopescan.git for deeper enumeration

Entry_10:
  Name: WordPress
  Description: WordPress Enumeration with WPScan
  Command: |
    ?What is the location of the wp-login.php? Example: /Yeet/cannon/wp-login.php
    wpscan --url {Web_Proto}://{IP}{1} --enumerate ap,at,cb,dbe && wpscan --url {Web_Proto}://{IP}{1} --enumerate u,tt,t,vp --passwords {Big_Passwordlist} -e

Entry_11:
  Name: WordPress Hydra Brute Force
  Description: Need User (admin is default)
  Command: hydra -l admin -P {Big_Passwordlist} {IP} -V http-form-post '/wp-login.php:log=^USER^&pwd=^PASS^&wp-submit=Log In&testcookie=1:S=Location'

Entry_12:
  Name: Ffuf Vhost
  Description: Simple Scan with Ffuf for discovering additional vhosts
  Command: ffuf -w {Subdomain_List}:FUZZ -u {Web_Proto}://{Domain_Name} -H "Host:FUZZ.{Domain_Name}" -c -mc all {Ffuf_Filters}