docs: new document selfhost-search-engine.typ
parent
92de1bc3f3
commit
609cff3bca
146
blog/selfhost-search-engine.typ
Normal file
@@ -0,0 +1,146 @@
#show link: underline
#set text(
  font: "ETBembo",
  size: 10pt)
#set page(
  paper: "a4",
  margin: 1cm,
)
#set par(
  justify: true,
  leading: 0.52em,
)

#align(center, text(20pt)[
  *An overview on hoaxes*
])

= Introduction
#link("https://docs.searxng.org/")[SearXNG];, put in its own words, is a
'free internet metasearch engine'. Note that it describes itself as a
#emph[metasearch] engine specifically: unlike a traditional search
engine like Google or Bing, SearXNG does things a little bit
differently. It aggregates the results produced by search services like
those aforementioned, and feeds them back to you.

Because of this key detail and a great deal of effort by those who’ve
helped shape it, SearXNG protects your privacy, and does so very well:

- Private data is removed from requests going to the search services it
  aggregates results from
- It does #strong[not] forward #emph[anything] to any third parties
  through search services
- Private data is #emph[also] removed from requests going to the results
  pages

Furthermore, SearXNG can be configured to use
#link("https://torproject.org")[Tor];.

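As a hedged sketch of what the Tor setup could look like on NixOS, once the `services.searx` block from the installation section below is in place: the `outgoing` settings route SearXNG’s requests through a local Tor SOCKS proxy. The address `127.0.0.1:9050` assumes the Tor client’s default SOCKS port, and the timeout value is only a starting point.

```nix
{
  # ...
  # Run a local Tor client; by default it offers a SOCKS proxy on 127.0.0.1:9050
  services.tor = {
    enable = true;
    client.enable = true;
  };
  services.searx = {
    # merge this into your existing services.searx block
    settings.outgoing = {
      # Send all outgoing requests through the Tor SOCKS proxy
      proxies."all://" = [ "socks5h://127.0.0.1:9050" ];
      using_tor_proxy = true;
      extra_proxy_timeout = 10.0; # Tor is slow; allow extra time per request
    };
  };
  # ...
}
```

Expect searches to be noticeably slower with this enabled, since every upstream request takes a detour through the Tor network.
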
However, privacy isn’t the engine’s only selling point; from my use of
it so far, it’s also great at…searching \(is that a surprise?). The fact
that it’s a metasearch engine plays a key role in this, as it lets
SearXNG pull results from several search services at once and gives
#emph[you] the ability to further tailor your experience.

= Setting up SearXNG
== Installing the service
As you may have expected if you’ve used NixOS for a while, SearXNG is
packaged for NixOS #emph[and] has a service module. This makes setting
it up just that much easier.

To get started, place the following somewhere in your #emph[system]
config:

```nix
{
  # ...
  services.searx = {
    enable = true;
    settings = {
      server = {
        port = 8888;
        bind_address = "127.0.0.1";
        secret_key = "@SEARX_SECRET_KEY@";
        base_url = "https://search.devraza.duckdns.org/"; # replace with wherever you want to host yours
      };
    };
  };
  # ...
}
```

The snippet above enables the `searx` systemd service, which listens on
port `8888` of `127.0.0.1`, and assumes a `base_url` of
`https://search.devraza.duckdns.org/`. Because it binds to the loopback
address, the instance is only reachable from the server itself for now.

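The `secret_key` value in that snippet is a placeholder rather than a real key. Here is a minimal sketch of one way to fill it in, assuming your NixOS release’s `services.searx.environmentFile` option substitutes `@NAME@` placeholders from an environment file; the path `/run/secrets/searx.env` is hypothetical.

```nix
{
  # ...
  services.searx = {
    # Hypothetical path; the file should be readable only by root and
    # contain a line such as: SEARX_SECRET_KEY=<long random string>
    environmentFile = "/run/secrets/searx.env";
  };
  # ...
}
```

A suitable value can be generated with, for example, `openssl rand -hex 32`. Keeping the key in a separate file also keeps it out of the world-readable Nix store.
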
Now that we’ve got the actual `searx` instance running, we can set up a
reverse proxy allowing the service to be accessed remotely \(whether
this is within your local network or across the internet is up to you).

== Setting up a reverse proxy
=== What is a reverse proxy?
Before I get started with the technical details of setting this up, I’d
like to briefly clarify what exactly a reverse proxy is \(to my
understanding).

Let’s get the Wikipedia definition of a reverse proxy out of the way
first:

#quote(block: true)[
  \[…\] a reverse proxy is an application that sits in front of back-end
  applications and forwards client requests to those applications. \[…\]
]

However, you might be confused as to what this actually means; I’ll give
an example of the usage of reverse proxies to better explain this:

- Suppose you’ve got a few services running on a server \(for
  demonstration purposes, let’s name these `x`, `y` and `z`), each
  running on its own unique port.
- Assuming you had a domain, and wanted to access each of these services
  from its own sub-domain \(e.g.~`x.yourdomain.com`,
  `y.yourdomain.com` and `z.yourdomain.com`), you would use a
  reverse proxy.
- This reverse proxy would take in requests from clients going to these
  sub-domains, and forward each request to the appropriate port on
  your machine for the service being requested.

The concept should be clear now, if it wasn’t already.

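As a rough sketch of that example in NixOS terms, using NGINX \(introduced properly in the next section\): assuming the hypothetical services `x`, `y` and `z` listen on the made-up local ports `3001`, `3002` and `3003`, each sub-domain gets a virtual host that forwards to one port.

```nix
{
  services.nginx = {
    enable = true;
    virtualHosts = {
      # Hypothetical services and ports, for illustration only
      "x.yourdomain.com".locations."/".proxyPass = "http://127.0.0.1:3001";
      "y.yourdomain.com".locations."/".proxyPass = "http://127.0.0.1:3002";
      "z.yourdomain.com".locations."/".proxyPass = "http://127.0.0.1:3003";
    };
  };
}
```

NGINX picks the virtual host from the host name the client asked for, so all three sub-domains can share a single IP address and port.
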
=== Using NGINX to set up the reverse proxy
NGINX is a popular web server that supports the creation of virtual
hosts and the usage of reverse proxies. To accommodate our `searx`
instance, we append the following to our NixOS server configuration:

```nix
# note: this module must take `config` as an argument, e.g. { config, ... }:
{
  # ...
  services.nginx = {
    enable = true;
    # any extra configuration here
    virtualHosts = {
      "search" = { # this can be anything, being an arbitrary identifier
        forceSSL = true;
        serverName = "search.yourdomain.com"; # replace this with whatever you're serving from
        # SearX proxy
        locations."/" = {
          proxyPass = "http://${toString config.services.searx.settings.server.bind_address}:${toString config.services.searx.settings.server.port}";
          proxyWebsockets = true;
          recommendedProxySettings = true;
        };
      };
    };
  };
  # ...
}
```

The `proxyPass` expression above builds the address NGINX forwards
requests to from your `searx` settings, so the two stay in sync if you
later change the port or bind address.

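To see what that interpolation produces without rebuilding anything, here is a small standalone sketch. The `server` attribute set is a stand-in for `config.services.searx.settings.server`, with the values from the earlier `searx` snippet copied in by hand.

```nix
let
  # Stand-in for config.services.searx.settings.server
  server = {
    bind_address = "127.0.0.1";
    port = 8888;
  };
in
  # toString turns the integer port into a string so it can be interpolated
  "http://${toString server.bind_address}:${toString server.port}"
```

Saved as a file \(say, the hypothetical `proxy-url.nix`\), this can be evaluated with `nix-instantiate --eval proxy-url.nix`, which should print `"http://127.0.0.1:8888"`.
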
After saving your changes and rebuilding your server’s system
configuration \(as usual), you should have a working #emph[private]
instance of SearXNG that you can access using the `serverName` you’ve
given it.

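One caveat: `forceSSL = true` only works if NGINX has a certificate for the host. Below is a minimal sketch of one common way to get one, assuming the server is reachable from the internet on ports 80 and 443 and that you are happy to use Let’s Encrypt; the e-mail address is a placeholder.

```nix
{
  # ...
  security.acme = {
    acceptTerms = true; # you accept the CA's terms of service
    defaults.email = "you@yourdomain.com"; # placeholder address
  };
  services.nginx.virtualHosts."search".enableACME = true;
  # Let HTTP (for the ACME challenge and redirects) and HTTPS through
  networking.firewall.allowedTCPPorts = [ 80 443 ];
  # ...
}
```

For an instance that is only reachable inside your local network, pointing the virtual host’s `sslCertificate` and `sslCertificateKey` options at your own certificate is an alternative.
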
Set your browser to use this as your search engine using the relevant
documentation \(with Firefox this is as easy as right-clicking on the
URL bar after opening up the page and clicking a button). Enjoy!