Website challenges

This is not a doomsday article

It is about awareness of the risks and consequences, and about seeing clearly the trade-offs that we make when creating and populating a website.


I have been maintaining a website on a virtual machine, fully exposed to the internet. My site has the following features:
  • A fully functional, bi-directional email server
  • An Apache2 web server responding to both HTTP and HTTPS.
    The site is a huge blog with some 30,000 photos of wildlife and car races, and some technical articles like this one.

At first, innocently, I thought that a site like mine would not be attractive to hackers because it had no commercial value. Boy was I wrong!

So what is the value of my site, i.e. why are there so many attempts to hack it?

By analyzing the penetration attempts, I have come to some conclusions:
  • The mailer is an extremely desirable target to take over.
    Most likely it would be used as yet another spam originator.
  • The website is also desirable.
    Possibly to house a parasite website used to store and forward files. The website could also be turned into a jumping-off point for further hacks. This in itself I consider quite sinister, since large organizations with vast resources and expertise need sites all over the world to perpetrate their actions. It could also serve as a way to launder communications and make them untraceable.
  • The web instance itself is valuable as a pawn and stepping stone for Denial of Service attacks. The aim of such attacks is to paralyze your capability to render service, and also the capability of the infrastructure that supports you.
  • And then there are the "trolls" with nothing better to do.

What other value has my website?

There are other, non-destructive uses of a site like mine: companies gather intel about people, their use of the site, their interests, etc., to derive marketing information.

What to do about it.

For the interactions that are invited.

Obviously, there are things that we invite when we build a site like mine. For instance, Google and Facebook will scan sites to gain information that may (or may not) have marketing value. However we can, as I do, consider these relationships mutually beneficial. I consider the publicity that my content gains to outweigh the small inconvenience of their scans. I do make sure not to put sensitive or personal data out there.
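For crawlers that play by the rules, the terms of this relationship can be stated in a robots.txt file served at the site root. A minimal sketch (the paths and bot name here are hypothetical examples, not my actual layout):

```
# robots.txt — served at https://example.com/robots.txt
# Let well-behaved crawlers index everything except a private area.
User-agent: *
Disallow: /private/

# Example: ask one specific bot to stay out entirely.
User-agent: BadBot
Disallow: /
```

Note that robots.txt is a request, not an enforcement mechanism; the shadier visitors discussed below simply ignore it.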

For the more shady and destructive.

It's clear to me that no webmaster will put up a site and not care whether it is pirated or hacked. Webmasters worth their salt will take steps to reduce the risks as much as possible. I would not be surprised if neglecting to secure a site eventually became grounds for lawsuits.

The arsenal to counteract hackers and spammers exists and is quite extensive. However, it all starts with a properly configured site. No, I don't know all that can be done, but I do put into practice all that I have learned. The attacks become more and more sophisticated, and the counters follow suit. It is a war out there on the internet, and there is no doubt about it.

  • The OS. It is the first line of defense. Most Linux and Windows distros (OS releases) are configured by default to be as safe as possible. However, they must be kept at the latest versions in order to benefit from the latest security fixes. (Do the damn updates!) As for passwords, there is a ton of papers written about them; in the end it all boils down to keeping the minimum number of users with live passwords and choosing those passwords wisely. On my system I make do with 3 users that have passwords.
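The "minimum number of users with live passwords" point can be audited mechanically. A hedged sketch: the pipeline below runs on inlined sample lines in shadow(5) format (the account names are made up); on a real system you would feed it /etc/shadow as root.

```shell
# Keep the OS current first (Debian/Ubuntu flavor shown; run as root):
#   apt update && apt upgrade

# List accounts whose password field is a real hash, i.e. not locked
# with "!" or "*". Sample data stands in for /etc/shadow here.
printf '%s\n' \
  'root:$6$salt$hash:19000:0:99999:7:::' \
  'daemon:*:19000:0:99999:7:::' \
  'www-data:!:19000:0:99999:7:::' \
  'alice:$6$salt$hash:19000:0:99999:7:::' |
awk -F: '$2 !~ /^[!*]/ {print $1}'
```

On the sample above, only the two unlocked accounts (root and alice) are printed; anything unexpected in that list on a real system deserves a look.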
  • Mailers are by far the most difficult piece of software to configure and make safe. There are many treatises about configuring them. I will only mention that you must absolutely make sure to permit relaying of messages only for domains that you absolutely trust; in general, only the domain of the website itself should be permitted. Also, care must be taken in regards to spam. The solution that I implemented involves the use of "dovecot", "sendmail", "saslauth", "mimedefang-filter" and "spam-assassin". (I will make their configuration files available on demand; the explanations are beyond the scope of this article.)
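For sendmail specifically, the relay restriction described above is typically expressed in two files. A sketch, with example.com standing in for the real domain (this is an illustration, not my actual configuration):

```
# /etc/mail/relay-domains — one domain per line; mail for (or from)
# these domains may be relayed, everything else is refused.
example.com

# /etc/mail/access — finer-grained policy. Rebuild the database after
# editing with:  makemap hash /etc/mail/access.db < /etc/mail/access
Connect:127.0.0.1        RELAY
To:example.com           RELAY
```

An open relay is found and abused within hours, so this is the single most important piece of the mailer configuration.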
  • "Apache2", like "sendmail", has had many papers written about its proper configuration. They are well worth a good read. The distros install a pretty safe version. A note in passing about Google and Facebook: in the near future, they both will favor encrypted communications, i.e. the use of HTTPS rather than HTTP. This is for their protection as well as ours. You will need either a self-signed security certificate or one issued by a recognized institution. The former is at times refused by browsers. There are some sources for free certificates.
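The usual arrangement is to serve everything over HTTPS and redirect plain HTTP there. A minimal Apache sketch, with a hypothetical domain and certificate paths (free certificates are available from, e.g., Let's Encrypt):

```
# Hypothetical vhosts; example.com stands in for the real domain.
<VirtualHost *:80>
    ServerName example.com
    # Send all plain-HTTP traffic to the encrypted site.
    Redirect permanent / https://example.com/
</VirtualHost>

<VirtualHost *:443>
    ServerName example.com
    DocumentRoot /var/www/html
    SSLEngine on
    SSLCertificateFile    /etc/ssl/certs/example.com.crt
    SSLCertificateKeyFile /etc/ssl/private/example.com.key
</VirtualHost>
```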
  • Firewall your system! The OS distros are delivered for use in general-purpose computers. This means that all kinds of applications can be used, which requires that many TCP and UDP ports be open "in case they are used". In my case, I only open ports for SSH, Apache2, Sendmail and Dovecot.

    In "ufw" parlance my firewall looks like this (I added the comment column and replaced specific IP addresses with --names--):

    Status: active
    To           Action From             Comment
    --           ------ ----             -------
    80/tcp       ALLOW  Anywhere         HTTP
    25/tcp       ALLOW  Anywhere         SMTP
    Anywhere     ALLOW  --this system--  ALL
    993/tcp      ALLOW  Anywhere         IMAPS
    465/tcp      ALLOW  Anywhere         Sendmail MTA
    587/tcp      ALLOW  Anywhere         Sendmail MTA
    53           ALLOW  --DNS server 1-- DNS
    53           ALLOW  --DNS server 2-- DNS
    443/tcp      ALLOW  Anywhere         HTTPS
    Anywhere     ALLOW  --VPN 1--        VPN access
    Anywhere     ALLOW  --VPN 2--        VPN access
    80/tcp (v6)  ALLOW  Anywhere (v6)    HTTP   
    25/tcp (v6)  ALLOW  Anywhere (v6)    SMTP
    993/tcp (v6) ALLOW  Anywhere (v6)    IMAPS
    465/tcp (v6) ALLOW  Anywhere (v6)    Sendmail MTA
    587/tcp (v6) ALLOW  Anywhere (v6)    Sendmail MTA  
    443/tcp (v6) ALLOW  Anywhere (v6)    HTTPS
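A rule set like the one above can be built with commands along these lines (a sketch, not the exact commands I ran; the DNS address is a placeholder, and ufw adds the IPv6 twins automatically when IPv6 is enabled in /etc/default/ufw):

```shell
# Default deny inbound, then open only what the services need.
ufw default deny incoming
ufw allow 80/tcp  comment 'HTTP'
ufw allow 443/tcp comment 'HTTPS'
ufw allow 25/tcp  comment 'SMTP'
ufw allow 465/tcp comment 'Sendmail MTA'
ufw allow 587/tcp comment 'Sendmail MTA'
ufw allow 993/tcp comment 'IMAPS'
# DNS only to the resolvers actually used (placeholder address).
ufw allow from 192.0.2.53 to any port 53 comment 'DNS'
ufw enable
```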
  • "Fail2ban" is a tool that maintains a list of banned IPs, to effectively stop those sites from trying to break in again after being flagged by an application or the firewall. It comes with a good configuration, but I found that I had to add to it in order to gain some more isolation between my site and those attackers. This prevents a hacking site from trying different variations of their hacks on my site and from repeating attacks.
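Local additions to fail2ban normally go in a jail.local overlay rather than the shipped jail.conf. A hedged sketch of such an overlay (the values are illustrative, not my settings; sshd and apache-noscript are stock jails shipped with fail2ban):

```
# /etc/fail2ban/jail.local — local overrides, read after jail.conf.
[DEFAULT]
bantime  = 86400   ; keep an offender out for a day
findtime = 600     ; ...when maxretry failures occur within 10 minutes
maxretry = 3

[sshd]
enabled = true

[apache-noscript]
enabled = true
```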

Who are these enemies?

Without becoming melodramatic about it, it is about companies, governments and personal enemies, along with people with too much time on their hands. We should not be surprised to read that the most sophisticated pieces of hacking software were built in, and leaked out of, government-funded laboratories. These same laboratories are at the front line of this war, in both defensive and offensive capacities. For the companies it is much simpler: it's all about the $. Companies use the internet to spy on the competition, to gain any advantage possible over their competitors. It has always been thus, and now the internet is but another tool at their disposal.

In regards to personal enemies, it gets even murkier. I don't want to go very deep into this subject, but we must all remember that anything we say on the internet can be used against us. For example, continually posting our whereabouts will inform any would-be thieves that the house may not be occupied.

How big is the problem ... back to my site

I keep some data on the hacking attempts against my site. The following graph illustrates some of the issues that my site faces on a daily basis. Culprit sites are "banned", so each could only try once; in other words, these counts represent new sites that attempted malicious accesses.

  • "port-scanner": Sites that were banned after trying to find an open TCP port that they could try to exploit.
  • "sendmail": Sites that have tried to use my mailer for their own purpose.
  • "spam": The number of sites that successfully sent spam to my account
  • "apache404": Sites that crafted an HTTP request so as to subvert my underlying database. (Although that was very prevalent in the beginning, it has been a declining concern.)
  • "dovecot": Attempts against the IMAP service. (Also much reduced as of late.)
  • "bannow": A bit of a catch-all. Many attacks are hard to classify; this number includes sites that tried to access the site via HTTP or HTTPS in a way meant to gain insight into the workings of my site, sites that tried to use mine to bounce to another site (a proxy), and many more.
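Per-category counts like these can be pulled straight from the fail2ban log. A sketch with inlined sample lines (on a real system, point the pipeline at /var/log/fail2ban.log; the jail names and IP addresses below are made up):

```shell
# Count "Ban" events per jail from fail2ban-style log lines.
printf '%s\n' \
  '2024-01-01 00:00:01 fail2ban.actions [1]: NOTICE [sshd] Ban 203.0.113.5' \
  '2024-01-01 00:05:09 fail2ban.actions [1]: NOTICE [apache404] Ban 198.51.100.7' \
  '2024-01-01 00:07:42 fail2ban.actions [1]: NOTICE [sshd] Ban 192.0.2.9' |
grep -o '\[[a-z0-9-]*\] Ban' | sort | uniq -c
```

Run daily from cron against the real log, this yields exactly the kind of time series the graph is built from.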

This graph is live. If you are interested, you can come back and see how the situation evolves.

About the information

As mentioned above, there are risks, and ways to mitigate them, in ensuring the health of a website. Now, about the information on the site.

What does the site say about me?

Make no mistake, the website solution, the type of information, the quality, quantity and actual content of a website will say a lot about the owner and the webmaster. The first-degree inferences are almost direct and obvious unless extreme care is exercised. In my case it is about my hobbies, so it is almost impossible to hide most of that information. I am exposing the facts that I like photography a lot, that I like to dabble in electronics, that I have a university degree, that I am not a native English speaker (my native tongue is French), that I have a lot of spare time, etc. This information makes me a good target for certain types of publicity ... you can imagine the rest. Had I used my website to state religious or political opinions, much more about me would have been available. In my mind, not a good situation!

In itself this is of little concern to the big companies. What they are more interested in is the list of like-minded people, or of those that share some of my affinities.

The care that must be exercised when making information available on websites is the same that must be exercised when using social media such as Twitter and Facebook. One can easily liken Facebook and Twitter personal pages to mini-websites!

The rule of thumb about the longevity of the information on social networks and websites is FOREVER! Many have found that it is almost impossible to erase information put on the internet. Hey, I can still find stuff that I authored more than 20 years ago!!!
