docs: new document selfhost-search-engine.typ
parent
92de1bc3f3
commit
609cff3bca
1 changed files with 146 additions and 0 deletions
146
blog/selfhost-search-engine.typ
Normal file
@ -0,0 +1,146 @@
#show link: underline
#set text(
  font: "ETBembo",
  size: 10pt,
)
#set page(
  paper: "a4",
  margin: 1cm,
)
#set par(
  justify: true,
  leading: 0.52em,
)

#align(center, text(20pt)[
  *Self-hosting a search engine with SearXNG*
])

= Introduction
#link("https://docs.searxng.org/")[SearXNG];, put in its own words, is a
'free internet metasearch engine'. Note that it describes itself as a
#emph[metasearch] engine specifically - unlike a traditional search
engine such as Google or Bing, SearXNG does things a little
differently: it aggregates the results produced by search services like
those just mentioned, and feeds them back to you.

Because of this key detail and a great deal of effort by those who’ve
helped shape it, SearXNG protects your privacy, and does so very well:

- Private data is removed from requests going to the search services it
  aggregates results from
- It does #strong[not] forward #emph[anything] to any third parties
  through search services
- Private data is #emph[also] removed from requests going to the results
  pages

Furthermore, SearXNG can be configured to use
#link("https://torproject.org")[Tor];.

However, privacy isn’t the engine’s only selling point; from my use of
it so far, it’s also great
at…searching \(is that a surprise?). The fact that it’s a metasearch
engine plays a key role in this, as it lets SearXNG draw results from
several search services at once and gives #emph[you] the ability to
further tailor your experience.

= Setting up SearXNG
== Installing the service
As you may have expected if you’ve used NixOS for a while, SearXNG is
packaged #emph[and] has a service module on NixOS. This makes setting it up
just that much easier.

To get started, place the following somewhere in your #emph[system]
config:

```nix
{
  # ...
  services.searx = {
    enable = true;
    settings = {
      server = {
        port = 8888;
        bind_address = "127.0.0.1";
        # placeholder, filled in from services.searx.environmentFile when the service starts
        secret_key = "@SEARX_SECRET_KEY@";
        base_url = "https://search.devraza.duckdns.org/"; # replace with wherever you want to host yours
      };
    };
  };
  # ...
}
```

The snippet above enables the `searx` systemd service, which listens on
port `8888` of the loopback address, and assumes a `base_url` of
`https://search.devraza.duckdns.org`.

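The `secret_key` value in the snippet is a placeholder rather than a real key. Below is a minimal sketch of one way to fill it in, assuming the NixOS module's `environmentFile` option. The file path is hypothetical and not part of the original setup:

```nix
{
  services.searx = {
    # Hypothetical path: a root-readable file kept outside the Nix store,
    # containing a line of the form SEARX_SECRET_KEY=<long random string>
    environmentFile = "/var/lib/secrets/searx.env";
  };
}
```

The module substitutes `@SEARX_SECRET_KEY@` with the value from that file when the service starts, which keeps the key out of the world-readable Nix store. Something like `openssl rand -hex 32` is one way to generate the value.
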
Now that we’ve got the actual `searx` instance running, we can set
up a reverse proxy allowing the service to be accessed remotely
\(whether this is within your local network or across the internet is up
to you).

== Setting up a reverse proxy
=== What is a reverse proxy?
Before I get into the technical details of setting this up, I’d
like to briefly clarify what exactly a reverse proxy is \(to my
understanding).

Let’s get the Wikipedia definition of a reverse proxy out of the way
first:

#quote(block: true)[
\[…\] a reverse proxy is an application that sits in front of back-end
applications and forwards client requests to those applications. \[…\]
]

However, you might be confused as to what this actually means, so here
is an example of how a reverse proxy is used:

- Suppose you’ve got a few services running on a server \(for
  demonstration purposes, let’s name these `x`, `y` and `z`), each
  running on its own unique port.
- Assuming you had a domain, and wanted to access each of these services
  from its own sub-domain \(e.g.~`x.yourdomain.com`,
  `y.yourdomain.com` and `z.yourdomain.com`), you could do so with a
  reverse proxy.
- This reverse proxy would take in requests from clients going to these
  sub-domains, and forward each request to the appropriate port on
  your machine for the service being requested.

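As a rough sketch, the example above might look like this with the NixOS NGINX module. The port numbers are made up for illustration, and `yourdomain.com` stands in for a domain you control:

```nix
{
  services.nginx = {
    enable = true;
    virtualHosts = {
      # Each sub-domain is forwarded to the local port of one service.
      # The ports below are hypothetical.
      "x.yourdomain.com".locations."/".proxyPass = "http://127.0.0.1:3001";
      "y.yourdomain.com".locations."/".proxyPass = "http://127.0.0.1:3002";
      "z.yourdomain.com".locations."/".proxyPass = "http://127.0.0.1:3003";
    };
  };
}
```

NGINX picks the virtual host by the host name the client asked for, so all three services can share one public address.
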
The concept should be clear now, if it wasn’t already.

=== Using NGINX to set up the reverse proxy
NGINX is a popular web server that supports virtual
hosts and can act as a reverse proxy. To accommodate our `searx`
instance, we add the following to our NixOS server configuration:

```nix
{
  # ...
  services.nginx = {
    enable = true;
    # any extra configuration here
    virtualHosts = {
      "search" = { # this can be anything, being an arbitrary identifier
        forceSSL = true;
        serverName = "search.yourdomain.com"; # replace this with whatever you're serving from
        # SearX proxy
        locations."/" = {
          proxyPass = "http://${toString config.services.searx.settings.server.bind_address}:${toString config.services.searx.settings.server.port}";
          proxyWebsockets = true;
          recommendedProxySettings = true;
        };
      };
    };
  };
  # ...
}
```

The `proxyPass` expression above is built from your `searx` settings, so the address NGINX forwards requests to follows any change you make to `bind_address` or `port` there.

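The virtual host above sets `forceSSL`, so NGINX needs a TLS certificate from somewhere. Below is a minimal sketch of one option, assuming you want Let's Encrypt certificates through the NixOS ACME module. The e-mail address is a placeholder, and none of this is from the original setup:

```nix
{
  # Agree to the certificate authority's terms and give a contact address (placeholder).
  security.acme = {
    acceptTerms = true;
    defaults.email = "you@yourdomain.com";
  };

  # Ask the NGINX module to request and renew a certificate for this virtual host.
  services.nginx.virtualHosts."search".enableACME = true;

  # Let HTTP (used for the ACME challenge and the redirect) and HTTPS through the firewall.
  networking.firewall.allowedTCPPorts = [ 80 443 ];
}
```

This relies on the server being reachable from the internet on port 80 under the `serverName` you chose. An instance that only lives on your local network would need a different way of obtaining a certificate.
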
After saving your changes and rebuilding your server’s system
configuration \(as usual), you should have a working #emph[private]
instance of SearXNG that you can access at the `serverName` you’ve
given it.

Set your browser to use this as your search engine by following the
relevant documentation \(with Firefox this is as easy as right-clicking
on the URL bar after opening the page and clicking a button). Enjoy!