#TIL : Using a web proxy to bypass firewalls

Someday, you will be blocked by a firewall while crawling or accessing some website. The usual reason is that your IP address has been blocked from reaching the server.

One solution is to use a web proxy (HTTP proxy, SOCKS4 or SOCKS5) to bypass the firewall by putting a middleman server between you and the target. This is not very secure, so you should only use it for HTTPS sites.

An HTTP proxy that supports HTTPS simply relays the encrypted TLS stream between you and the target (so you don't need to worry about the proxy server reading your data). It only knows which domain and IP address you're connecting to.
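To make the point about what the proxy can see more concrete, here is a minimal sketch of the plaintext `CONNECT` request a client sends to an HTTP proxy before the TLS handshake starts. The helper name `build_connect_request` and the host `example.com` are placeholders for illustration only; everything sent after this request is encrypted end to end.

```python
def build_connect_request(target_host: str, target_port: int = 443) -> bytes:
    """Return the plaintext request that asks an HTTP proxy to open a tunnel."""
    authority = f"{target_host}:{target_port}"
    lines = [
        f"CONNECT {authority} HTTP/1.1",  # the proxy only learns host and port
        f"Host: {authority}",
        "",  # blank line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


if __name__ == "__main__":
    # example.com is a placeholder target
    print(build_connect_request("example.com").decode("ascii"))
```

In practice you rarely build this by hand: in the Python standard library, `http.client.HTTPSConnection.set_tunnel()` sends the `CONNECT` request for you.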

To find a free proxy on the internet, try this service: https://gimmeproxy.com/

It provides a cool API to fetch a new proxy from its database.

For example, this endpoint returns a JSON response describing an anonymous proxy that supports HTTPS, is located in Japan and has a minimum speed of 100 KB:

http://gimmeproxy.com/api/getProxy?anonymityLevel=1&supportsHttps=true&country=JP&minSpeed=100
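If you call the API from a script, a small helper keeps the query readable. This is only a sketch: the parameter names are taken from the URL above, `supportsHttps` is set to `true` because the description asks for an HTTPS-capable proxy, and the helper names `build_proxy_query` and `fetch_proxy` are made up for this example. The structure of the JSON response is not assumed here, so `fetch_proxy` just returns the decoded object.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_ENDPOINT = "http://gimmeproxy.com/api/getProxy"


def build_proxy_query(country="JP", min_speed=100, https=True, anonymity=1):
    """Build the getProxy URL from the filters used in the example above."""
    params = {
        "anonymityLevel": anonymity,
        "supportsHttps": "true" if https else "false",
        "country": country,
        "minSpeed": min_speed,
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


def fetch_proxy(url, timeout=10):
    """Call the API and return the decoded JSON (needs network access)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    print(build_proxy_query())
```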

If you need more requests per day, try a subscription (cancelable and refundable). I tried it for the last few days and really liked the service (although I cancelled the subscription because I don't need a proxy anymore).

Break the rules ! ;)

#TIL : Enable reverse proxy in CentOS

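CentOS ships with SELinux enabled by default, and by default SELinux stops the web server from opening outbound network connections, so a reverse proxy fails until you enable this permission (run the commands as root).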

Enable temporarily (lost after a reboot)

$ /usr/sbin/setsebool httpd_can_network_connect 1

Enable permanently

$ /usr/sbin/setsebool -P httpd_can_network_connect 1
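To confirm the setting from a deployment script, you can read the boolean back with `getsebool`. The sketch below assumes the usual `name --> on|off` output format of `getsebool`; the function names are invented for this example.

```python
import subprocess


def parse_getsebool(output: str) -> bool:
    """Parse a line such as 'httpd_can_network_connect --> on'."""
    name, _, state = output.strip().partition(" --> ")
    if not name or state not in ("on", "off"):
        raise ValueError(f"unexpected getsebool output: {output!r}")
    return state == "on"


def httpd_can_network_connect() -> bool:
    """Ask SELinux for the current value (only works on an SELinux host)."""
    result = subprocess.run(
        ["getsebool", "httpd_can_network_connect"],
        capture_output=True, text=True, check=True,
    )
    return parse_getsebool(result.stdout)
```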

Webfonts Optimization Using Nginx

Context

Every time you decide to use a webfont on your website, you think of Google Web Fonts, which is probably the best CDN for webfonts at the moment.

So why is Google Fonts a good solution?

  • It has many popular, up-to-date fonts
  • Global CDN
  • Easy UX to get started
  • It auto-detects the user's browser, then returns a supported font format (TTF, WOFF or WOFF2)

But

  • It's a tracking endpoint (it sends data to Google)
  • Network latency (it can be slower than a CDN located in your own country)

Solution

We use a simple Node server and NGINX with the proxy cache module to solve this.

First, the simple Node server proxies your CSS request to Google's CSS endpoint, then replaces the Google hostnames in the response with your new hostname.

Second, NGINX detects your browser type via the User-Agent header, then forwards a User-Agent of the same type to the Google origin server.

Lastly, NGINX proxies all font requests to the Google Fonts server, caches the files on the local file system, and serves later requests for the same files from that cache.
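The hostname replacement in the first step can be pictured with a few lines of code. This is not the code from the repository (that server is written in Node); it is a rough Python sketch of the idea, assuming the font URLs in Google's CSS point at `fonts.gstatic.com` as in the NGINX configuration. The function name, the proxy hostname and the font path in the sample are hypothetical.

```python
GOOGLE_FONT_HOST = "fonts.gstatic.com"


def rewrite_font_css(css: str, proxy_host: str) -> str:
    """Point every font URL in the CSS at the proxy instead of Google."""
    return css.replace(f"https://{GOOGLE_FONT_HOST}/", f"https://{proxy_host}/")


if __name__ == "__main__":
    # Hypothetical CSS snippet; the real response contains one rule per font face.
    sample = (
        "@font-face { font-family: 'Noto Sans'; "
        "src: url(https://fonts.gstatic.com/s/notosans/example.woff2) format('woff2'); }"
    )
    print(rewrite_font_css(sample, "fonts.example.com"))
```

After the rewrite, the browser requests the font files from your hostname, where NGINX can serve them from its cache.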

Source code is here : https://github.com/khanhicetea/google-fonts-resolver

Installation

$ npm install
$ node main.js [proxy_fonts_hostname] [port=3000]

Usage

NGINX configuration file

# On-disk cache shared by both locations; entries unused for 14 days are dropped
proxy_cache_path /var/nginx/cdn_cache levels=1:2 use_temp_path=off keys_zone=cdn_cache:1024m max_size=1G inactive=14d;

# Reduce every visitor's User-Agent to one of three representative values,
# so Google returns a matching font format and the CSS cache stays small
map $http_user_agent $ua_fonts {
    default 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36';
    ~(MSIE\ 8) 'Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0)';
    ~(MSIE|iPhone|Version) 'Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; Trident/6.0)';
}

server {
    # "ssl" is needed here, otherwise the certificate below is never used
    listen 443 ssl http2;
    listen [::]:443 ssl http2;

    server_name [proxy_fonts_hostname];

    ssl_certificate /etc/nginx/key/[proxy_fonts_hostname].crt;
    ssl_certificate_key /etc/nginx/key/[proxy_fonts_hostname].key;

    # CSS requests go to the Node server, cached per URL and mapped User-Agent
    location /css {
        proxy_pass http://127.0.0.1:3000;
        proxy_http_version 1.1;
        proxy_set_header User-Agent $ua_fonts;
        proxy_redirect off;

        proxy_cache cdn_cache;
        proxy_cache_key "$request_uri$ua_fonts";
        proxy_cache_lock on;
        proxy_cache_lock_age 5s;
        proxy_cache_lock_timeout 5s;
        proxy_cache_methods GET;
        proxy_cache_valid 200 1d;
        proxy_cache_valid any 60s;

        # HIT / MISS / EXPIRED, useful for checking that the cache works
        add_header X-Cache-Status $upstream_cache_status;

        expires 7d;
    }

    # Everything else is a font file: fetch it from Google once, then serve it from cache
    location / {
        proxy_pass https://fonts.gstatic.com/;
        proxy_http_version 1.1;
        proxy_set_header User-Agent $http_user_agent;
        proxy_set_header Host fonts.gstatic.com;
        proxy_redirect off;

        proxy_cache cdn_cache;
        proxy_cache_key "$request_uri";
        proxy_cache_lock on;
        proxy_cache_lock_age 5s;
        proxy_cache_lock_timeout 5s;
        proxy_cache_methods GET;
        proxy_cache_valid 200 7d;
        proxy_cache_valid any 60s;

        add_header X-Cache-Status $upstream_cache_status;

        expires 7d;
    }
}

  1. Restart your NGINX server!
  2. Select a web font on Google Fonts
  3. Replace the hostname in the import URL with your hostname
  4. Enjoy the speed!
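If you want to check which User-Agent the `map` block in the configuration will forward for a given browser, the same rules can be mirrored in a short script. This is a sketch of the matching logic only, assuming that NGINX tests the regular expressions in the order they appear and falls back to `default`; the names `UA_RULES` and `map_user_agent` are made up for this example.

```python
import re

DEFAULT_UA = ("Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
              "(KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36")
IE8_UA = "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0)"
IE10_UA = "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; Trident/6.0)"

# Same patterns and order as the map block (case-sensitive, first match wins).
UA_RULES = [
    (re.compile(r"MSIE 8"), IE8_UA),
    (re.compile(r"MSIE|iPhone|Version"), IE10_UA),
]


def map_user_agent(user_agent: str) -> str:
    """Return the User-Agent that would be forwarded to Google for the CSS request."""
    for pattern, forwarded in UA_RULES:
        if pattern.search(user_agent):
            return forwarded
    return DEFAULT_UA


if __name__ == "__main__":
    firefox = "Mozilla/5.0 (X11; Linux x86_64; rv:55.0) Gecko/20100101 Firefox/55.0"
    print(map_user_agent(firefox))
```

Only three distinct values ever reach Google, which is also why the CSS cache key in the configuration uses `$ua_fonts` rather than the raw header.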

A BONUS TIP

Put your webfont import URL in a preload link tag to ask the browser to start loading the stylesheet immediately after loading the HTML

<link rel="preload" href="https://[your-font-cdn]/css?family=Noto+Sans:400,400i,700,700i&amp;subset=vietnamese" as="style">

Benchmark

When I tried this solution with our CDN located in Vietnam, the result was 3-7 times faster (lower latency and download time).