mirror of https://git.lolcat.ca/lolcat/4get.git synced 2024-12-03 23:42:16 -05:00
lolcat 2023-11-07 08:04:56 -05:00
parent 64b090ee05
commit 785452873f
59 changed files with 2592 additions and 1277 deletions

132
README.md

@ -3,12 +3,21 @@
# 4get
4get is a metasearch engine that doesn't suck (they live in our walls!)
## About 4get
# About 4get
https://4get.ca/about
## Try it out
# Try it out
https://4get.ca
# Totally unbiased comparison between alternatives
| | 4get | searx(ng) | librex | araa |
|----------------------------|-------------------------|-----------|-------------|----------|
| RAM usage | ~200-400 MB | ~2 GB | ~200-400 MB | ~2 GB |
| Does it suck | no (debunked by snopes) | yes | yes | a little |
| Does it work | ye | no | no | ye |
| Did the dev commit suicide | not until my 30s | idk | yes | no |
## Supported websites
1. Web
- DuckDuckGo
@ -36,7 +45,6 @@ https://4get.ca
4. News
- DuckDuckGo
- Brave
- Google
- Mojeek
5. Music
@ -55,15 +63,15 @@ https://4get.ca
More scrapers are coming soon. I currently want to add Google web/video/news search, HackerNews (durr orange site!!) and Qwant. Shopping and files tabs are also on my todo list.
# Setup
# Installation
This section is still to-do. You will need to figure shit out for some of the apache2 and nginx stuff. Everything else should be OK.
## Apache
## Install on Apache
Log in as root.
```sh
apt install apache2 certbot php-dom php-imagick imagemagick php-curl curl php-apcu git libapache2-mod-php python3-certbot-apache
apt install apache2 certbot php-imagick imagemagick php-curl curl php-apcu git libapache2-mod-php python3-certbot-apache
service apache2 start
a2enmod rewrite
```
@ -90,7 +98,7 @@ chmod 777 -R icons/
Restart the service for good measure... `service apache2 restart`
## NGINX
## Install on NGINX
Log in as root.
@ -138,10 +146,54 @@ ln -s /etc/nginx/sites-available/4get.conf /etc/nginx/sites-available/4get.conf
Now test the nginx config with `nginx -t`. If it says that everything is good, restart nginx using `systemctl restart nginx`.
## Setup encryption
## Install using Docker (lol u lazy fuck)
```
docker run -d -p 80:80 -e FOURGET_SERVER_NAME="4get.ca" -e FOURGET_SERVER_ADMIN_EMAIL="you@example.com" luuul/4get:latest
```
...Or with SSL:
```
docker run -d -p 443:443 -e FOURGET_SERVER_NAME="4get.ca" -e FOURGET_SERVER_ADMIN_EMAIL="you@example.com" -v /etc/letsencrypt/live/domain.tld:/etc/4get/certs luuul/4get:latest
```
Replace the environment variables `FOURGET_SERVER_NAME` and `FOURGET_SERVER_ADMIN_EMAIL` with relevant values.
If the certificate files are not mounted to `/etc/4get/certs`, the service listens on port 80.
The certificate directory expects files named `cert.pem`, `chain.pem` and `privkey.pem`.
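Since the container falls back to port 80 when the certificates are not mounted, it can help to check the host directory before starting it with SSL. This is only a minimal sketch for a POSIX shell; `check_certs` is a hypothetical helper name and not part of 4get, and the path in the usage comment is the placeholder from the command above.

```sh
# Return 0 if the directory contains the three files the container expects,
# otherwise print what is missing to stderr and return 1.
check_certs() {
    missing=0
    for f in cert.pem chain.pem privkey.pem; do
        if [ ! -f "$1/$f" ]; then
            echo "missing: $1/$f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage:
#   check_certs /etc/letsencrypt/live/domain.tld && echo "certs look OK"
```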
## Install using Docker Compose
Copy `docker-compose.yaml`.
To serve custom banners, create a directory of images (named `banners`, for example) and mount it to `/var/www/html/4get/banner`.
```
version: "3.7"

services:
  fourget:
    image: luuul/4get:latest
    restart: always
    environment:
      - FOURGET_SERVER_NAME=4get.ca
      - FOURGET_SERVER_ADMIN_EMAIL=you@example.com
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - /etc/letsencrypt/live/domain.tld:/etc/4get/certs
      - ./banners:/var/www/html/4get/banner
```
Replace relevant values and start with `docker-compose up -d`
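To check that the container actually came up, you can query the `/ami4get` endpoint that this commit adds; it returns basic instance information as JSON with a `status` field. Below is a rough sketch, assuming `curl` and `sed` are installed and the instance is reachable on localhost port 80. `instance_status` is a made-up helper name, and it does a crude text match rather than real JSON parsing.

```sh
# Print the value of the "status" string found in JSON read from stdin.
# Prints nothing if no "status" key is present.
instance_status() {
    sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Usage (assumes the container is published on localhost port 80):
#   curl -s http://localhost/ami4get | instance_status
```

Going by `ami4get.php`, a healthy instance should report `ok` here.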
# Encryption setup
I'm schizoid (as you should) so I'm gonna setup 4096bit key encryption. To complete this step, you need a domain or subdomain in your possession. Make sure that the DNS shit for your domain has propagated properly before continuing, because certbot is a piece of shit that will error out the ass once you reach 5 attempts under an hour.
### Apache
## Encryption setup on Apache
```sh
certbot --apache --rsa-key-size 4096 -d www.yourdomain.com -d yourdomain.com
@ -169,7 +221,7 @@ Restart again
service apache2 restart
```
### NGINX
## Encryption setup on NGINX
Generate a certificate for the domain using:
@ -180,15 +232,13 @@ certbot --nginx --key-type ecdsa -d www.yourdomain.com -d yourdomain.com
After doing that certbot should deploy the certificate automatically into your 4get nginx config file. It should be ready to use at that point.
## Captcha
# Jesse it is time to configure the server the fucking bots are back
Right now the setup for this shit is absolutely awful.
Woohoo the awful piece of shit setup and fiddling with 3 gazillion files is GONE. All you need to do to configure your shit is to go in `data/config.php` and edit the self-documenting configuration file. You can also specify proxies in `data/proxies/whatever.txt` and captcha images in `data/captcha/category/1.png`... I further explain how to deal with that garbage in the config file I mentioned.
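The `CAPTCHA_DATASET` entries in `data/config.php` pair a category name with its image count, for example `["birds", 2263]`, and images are expected to be named `1.png` up to that count. A small sketch for working out the number to put there, assuming a POSIX shell; `count_dataset` and `dataset_entry` are hypothetical helper names, not part of 4get:

```sh
# Count how many consecutively numbered PNGs (1.png, 2.png, ...) exist in
# a captcha category directory. Stops at the first missing number.
count_dataset() {
    n=0
    while [ -f "$1/$((n + 1)).png" ]; do
        n=$((n + 1))
    done
    echo "$n"
}

# Print a line in the same shape as the examples in CAPTCHA_DATASET.
dataset_entry() {
    printf '["%s", %s],\n' "$(basename "$1")" "$(count_dataset "$1")"
}

# Usage:
#   dataset_entry data/captcha/birds
```

If the count comes out lower than the number of files in the folder, there is probably a gap in the numbering.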
Edit line 190 in `lib/captcha_gen.php` and specify your image sets. You can't disable the captcha right now lol. Just use a previous commit if you want to do that. Call me a shitcoder all you want I've had no energy lately. Images must be stored in `data/captcha`. Create a folder for each category. All files in there should be named from `1.png` to `321839.png`, for example.
# (Optional) Tor setup
## Tor Setup
1. Install tor.
1. Install `tor`.
2. Open `/etc/tor/torrc`
3. Go to the line that contains `HiddenServiceDir` and `HiddenServicePort`
4. Uncomment those 2 lines and set them like this:
@ -205,7 +255,7 @@ After you get your onion address you will need to configure your Apache or Nginx
I don't know how to configure this shit on Apache so here is the NGINX one.
### NGINX
## Tor setup on NGINX
Open your current 4get NGINX config (that is under `/etc/nginx/sites-available/`) and append this to the end of the file:
@ -240,49 +290,5 @@ server {
Obviously replace `<youronionaddress>` with the onion address found in `/var/lib/tor/4get/hostname`, then check that the nginx config is valid with `nginx -t`. If it is, restart the nginx service and try opening the onion address in the Tor Browser. You can see a real-world example [here](https://git.zzls.xyz/Fijxu/etc-configs/src/branch/selfhost/nginx/sites-available/4get.zzls.xyz.conf)
## Docker Install
```
docker run -d -p 80:80 -e FOURGET_SERVER_NAME="4get.ca" -e FOURGET_SERVER_ADMIN_EMAIL="you@example.com" luuul/4get:latest
```
With SSL
```
docker run -d -p 443:443 -e FOURGET_SERVER_NAME="4get.ca" -e FOURGET_SERVER_ADMIN_EMAIL="you@example.com" -v /etc/letsencrypt/live/domain.tld:/etc/4get/certs luuul/4get:latest
```
Replace the environment variables `FOURGET_SERVER_NAME` and `FOURGET_SERVER_ADMIN_EMAIL` with relevant values.
If the certificate files are not mounted to `/etc/4get/certs`, the service listens on port 80.
The certificate directory expects files named `cert.pem`, `chain.pem` and `privkey.pem`.
## Docker compose
Copy `docker-compose.yaml`.
To serve custom banners, create a directory of images (named `banners`, for example) and mount it to `/var/www/html/4get/banner`.
```
version: "3.7"

services:
  fourget:
    image: luuul/4get:latest
    restart: always
    environment:
      - FOURGET_SERVER_NAME=4get.ca
      - FOURGET_SERVER_ADMIN_EMAIL=you@example.com
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - /etc/letsencrypt/live/domain.tld:/etc/4get/certs
      - ./banners:/var/www/html/4get/banner
```
Replace relevant values and start with `docker-compose up -d`
# Contact
shit breaks all the time but I repair it all the time too. Email me here: will<at>lolcat(dot)ca

129
about.php

@ -1,128 +1,23 @@
<?php
include "data/config.php";
include "lib/frontend.php";
$frontend = new frontend();
echo
'<!DOCTYPE html>' .
'<html lang="en">' .
'<head>' .
'<meta http-equiv="Content-Type" content="text/html;charset=utf-8">' .
'<title>About</title>' .
'<link rel="stylesheet" href="/static/style.css">' .
'<meta name="viewport" content="width=device-width,initial-scale=1">' .
'<meta name="robots" content="index,follow">' .
'<link rel="icon" type="image/x-icon" href="/favicon.ico">' .
'<meta name="description" content="4get.ca: About">' .
'<link rel="search" type="application/opensearchdescription+xml" title="4get" href="/opensearch.xml">' .
'</head>' .
'<body class="' . $frontend->getthemeclass(false) . 'about">';
include "data/instances.php";
$compiledinstancelist = "";
foreach ($instancelist as $instance)
{
$compiledinstancelist .= "<tr> <td>".$instance["name"]."</td>";
$compiledinstancelist .= "<td> <a href=\"".$instance["address"]["uri"]."\">".$instance["address"]["displayname"]."</a>";
foreach ($instance["altaddresses"] as $alt)
{
$compiledinstancelist .= "<a href=\"".$alt["uri"]."\">(".$alt["displayname"].")</a></td>";
}
$compiledinstancelist .= "</tr>";
}
$frontend->load(
"header_nofilters.html",
[
"title" => "About",
"class" => " class=\"about\""
]
);
$left =
'<a href="/" class="link">&lt; Go back</a>
<h1>Set as default search engine</h1>
<a href="#firefox"><h2 id="firefox">On Firefox and other Gecko based browsers</h2></a>
To set this as your default search engine on Firefox, right click the URL bar and select <div class="code-inline">Add "4get"</div>. Then, visit <a href="about:preferences#search" target="_BLANK" class="link">about:preferences#search</a> and select <div class="code-inline">4get</div> in the dropdown menu.
<a href="#chrome"><h2 id="chrome">On Chromium and Blink based browsers</h2></a>
Click the 3 superpositioned dots at the top right of the screen and click on <div class="code-inline">Settings</div>, then search for <div class="code-inline">default search engine</div>, or visit <a href="chrome://settings/searchEngines">chrome://settings/searchEngines</a>.<br><br>
Once you\'re there, click the pencil on the last entry under "Search engines" (it\'s probably DuckDuckGo). Once you do that, a popup will appear. Populate it with the following information:
<table>
<tr>
<td><b>Field</b></td>
<td><b>Value</b></td>
</tr>
<tr>
<td>Search engine</td>
<td>4get</td>
</tr>
<tr>
<td>Shortcut</td>
<td>4get</td>
</tr>
<tr>
<td>URL with %s in place of query</td>
<td>https://4get.ca/web?s=%s</td>
</tr>
</table>
Once that\'s done, click <div class="code-inline">Save</div>. Then, on the right-hand side of the newly created entry, open the dropdown menu and select <div class="code-inline">Make default</div>.
<h1>Frequently asked questions</h1>
<a href="#what-is-this"><h2 id="what-is-this">What is this?</h2></a>
This is a metasearch engine that gets results from other engines, and strips away all of the tracking parameters and Microsoft/globohomo bullshit they add. Most of the other alternatives to Google jack themselves off about being ""privacy respecting"" or whatever the fuck but it always turns out to be a total lie, and I just got fed up with their shit honestly. Alternatives like Searx or YaCy all fucking sucks so I made my own thing.
<a href="#goal"><h2 id="goal">My goal</h2></a>
Provide users with a privacy oriented, extremely lightweight, ad free, free as in freedom (and free beer!) way to search for documents around the internet, with minimal, optional javascript code. My long term goal would be to build my own index (that doesn\'t suck) and provide users with an unbiased search engine, with no political inclinations.
<a href="#logs"><h2 id="logs">Do you keep logs?</h2></a>
I store data temporarily to get the next page of results. This might include search queries, tokens and other parameters. These parameters are encrypted using <div class="code-inline">aes-256-gcm</div> on the serber, for which I give you a key (also known internally as <div class="code-inline">npt</div> token). When you make a request to get the next page, you supply the token, the data is decrypted and the request is fulfilled. This encrypted data is deleted after 15 minutes, or after it\'s used, whichever comes first.<br><br>
I <b>don\'t</b> log IP addresses, user agents, or anything else. The <div class="code-inline">npt</div> tokens are the only things that are stored (in RAM, mind you), temporarily, encrypted.
<a href="#information-sharing"><h2 id="information-sharing">Do you share information with third parties?</h2></a>
Your search queries and supplied filters are shared with the scraper you chose (so I can get the search results, duh). I don\'t share anything else (that means I don\'t share your IP address, location, or anything of this kind). There is no way that site can know you\'re the one searching for something, <u>unless you send out a search query that de-anonymises you.</u> For example, a search query like "hello my full legal name is jonathan gallindo and i want pictures of cloacas" would definitely blow your cover. 4get doesn\'t contain ads or any third party javascript applets or trackers. I don\'t profile you, and quite frankly, I don\'t give a shit about what you search on there.<br><br>
TL;DR assume those websites can see what you search for, but can\'t see who you are (unless you\'re really dumb).
<a href="#hosting"><h2 id="hosting">Where is this website hosted?</h2></a>
This website is hosted on a Contabo shitbox in the United States.
<a href="#keyboard-shortcuts"><h2 id="keyboard-shortcuts">Keyboard shortcuts?</h2></a>
Use <div class="code-inline">/</div> to focus the search box.<br><br>
When the image viewer is open, you can use the following keybinds:<br>
<div class="code-inline">Up</div>, <div class="code-inline">Down</div>, <div class="code-inline">Left</div>, <div class="code-inline">Right</div> to rotate the image.<br>
<div class="code-inline">CTRL+Up</div>, <div class="code-inline">CTRL+Down</div>, <div class="code-inline">CTRL+Left</div>, <div class="code-inline">CTRL+Right</div> to mirror the image.<br>
<div class="code-inline">Escape</div> to exit the image viewer.
<a href="#instances"><h2 id="instances">Instances</h2></a>
4get is open source, anyone can create their own 4get instance! If you wish to add your website to this list, please <a href="https://lolcat.ca">contact me</a>.
<table>
<tr>
<td>Name</td>
<td>Address</td>
</tr>
'.$compiledinstancelist.'
</table>
<a href="#schizo"><h2 id="schizo">How can I trust you?</h2></a>
You just sort of have to take my word for it right now. If you\'d rather trust yourself instead of me (I believe in you!!), all of the code on this website is available through my <a href="https://git.lolcat.ca/lolcat" class="link">git page</a> for you to host on your own machines. Just a reminder: if you\'re the sole user of your instance, it doesn\'t take immense brain power for Microshit to figure out you basically just switched IP addresses. Invite your friends to use your instance!
<a href="#donate"><h2 id="donate">Support the project</h2></a>
Donate to me through ko-fi: <a href="https://ko-fi.com/lolcat" target="BLANK" rel="noreferrer">ko-fi.com/lolcat</a><br>
Please donate I sent myself a donation for testing if it works and it looks fucking dumb. Reasons to donate are listed on there. Thank you!
<a href="#contact"><h2 id="contact">I want to report abuse or have erotic roleplay through email</h2></a>
I don\'t know about that second part but if you want to talk to me, just drop me an email...<br><br>
<b>Message to all DMCA enforcers:</b> I don\'t host any of the content. Everything you see here is <u>proxied</u> through my shitbox with no moderation. Please reach out to the people hosting the infringing content instead.<br><br>
<a href="https://lolcat.ca" rel="dofollow" class="link">Click here to contact me!</a><br><br>
<a href="https://validator.w3.org/nu/?doc=https%3A%2F%2F4get.ca" title="W3 Valid!">
<img src="/static/icon/w3html.png" alt="Valid W3C HTML 4.01" width="88" height="31">
</a>';
// trim out whitespace
$left = explode("\n", $left);
explode(
"\n",
file_get_contents("template/about.html")
);
$out = "";

27
ami4get.php Normal file

@ -0,0 +1,27 @@
<?php
header("Content-Type: application/json");
header("Access-Control-Allow-Origin: *");
include "data/config.php";
$bot_requests = apcu_fetch("captcha");
$real_requests = apcu_fetch("real_requests");
echo json_encode(
[
"status" => "ok",
"service" => "4get",
"server" => [
"name" => config::SERVER_NAME,
"description" => config::SERVER_LONG_DESCRIPTION,
"bot_protection" => config::BOT_PROTECTION,
"real_requests" => $real_requests === false ? 0 : $real_requests,
"bot_requests" => $bot_requests === false ? 0 : $bot_requests,
"api_enabled" => config::API_ENABLED,
"alt_addresses" => config::ALT_ADDRESSES,
"version" => config::VERSION
],
"instances" => config::INSTANCES
]
);

View file

@ -119,6 +119,11 @@
/_____/_/ /_/\__,_/ .___/\____/_/_/ /_/\__/____/
/_/
+ /ami4get
Tells you basic information about the 4get instance. CORS requests
are allowed on this endpoint.
+ /api/v1/web
+ &extendedsearch
When using the ddg(DuckDuckGo) scraper, you may make use of the

View file

@ -1,5 +1,6 @@
<?php
include "../../data/config.php";
new autocomplete();
class autocomplete{
@ -17,7 +18,7 @@ class autocomplete{
"yep" => "https://api.yep.com/ac/?query={searchTerms}",
"marginalia" => "https://search.marginalia.nu/suggest/?partial={searchTerms}",
"yt" => "https://suggestqueries-clients6.youtube.com/complete/search?client=youtube&q={searchTerms}",
"sc" => "https://api-v2.soundcloud.com/search/queries?q={searchTerms}&client_id=ArYppSEotE3YiXCO4Nsgid2LLqJutiww&limit=10&offset=0&linked_partitioning=1&app_version=1693487844&app_locale=en"
"sc" => "https://api-v2.soundcloud.com/search/queries?q={searchTerms}&client_id=" . config::SC_CLIENT_TOKEN . "&limit=10&offset=0&linked_partitioning=1&app_version=1693487844&app_locale=en"
];
/*
@ -107,7 +108,8 @@ class autocomplete{
[
$_GET["s"],
$json
]
],
JSON_UNESCAPED_UNICODE | JSON_UNESCAPED_SLASHES
);
break;
@ -132,7 +134,8 @@ class autocomplete{
[
$_GET["s"],
$json
]
],
JSON_UNESCAPED_UNICODE | JSON_UNESCAPED_SLASHES
);
break;
@ -150,7 +153,8 @@ class autocomplete{
[
$_GET["s"],
$json
]
],
JSON_UNESCAPED_UNICODE | JSON_UNESCAPED_SLASHES
);
break;
@ -162,7 +166,8 @@ class autocomplete{
[
$_GET["s"],
$json[1] // ensure it contains valid key 0
]
],
JSON_UNESCAPED_UNICODE | JSON_UNESCAPED_SLASHES
);
break;
}
@ -170,6 +175,7 @@ class autocomplete{
private function get($url, $query){
try{
$curlproc = curl_init();
$url = str_replace("{searchTerms}", urlencode($query), $url);
@ -204,11 +210,19 @@ class autocomplete{
curl_close($curlproc);
return $data;
}catch(Exception $error){
$this->do404("Curl error: " . $error->getMessage());
}
}
private function do404($error){
echo json_encode(["error" => $error]);
echo json_encode(
["error" => $error],
JSON_UNESCAPED_UNICODE | JSON_UNESCAPED_SLASHES
);
die();
}
@ -218,7 +232,8 @@ class autocomplete{
[
$_GET["s"],
[]
]
],
JSON_UNESCAPED_UNICODE | JSON_UNESCAPED_SLASHES
);
die();
}

View file

@ -1,8 +1,14 @@
<?php
chdir("../../");
header("Content-Type: application/json");
chdir("../../");
include "data/config.php";
if(config::API_ENABLED === false){
echo json_encode(["status" => "The server administrator disabled the API!"]);
return;
}
include "lib/frontend.php";
$frontend = new frontend();

View file

@ -1,8 +1,14 @@
<?php
chdir("../../");
header("Content-Type: application/json");
chdir("../../");
include "data/config.php";
if(config::API_ENABLED === false){
echo json_encode(["status" => "The server administrator disabled the API!"]);
return;
}
include "lib/frontend.php";
$frontend = new frontend();

View file

@ -1,8 +1,14 @@
<?php
chdir("../../");
header("Content-Type: application/json");
chdir("../../");
include "data/config.php";
if(config::API_ENABLED === false){
echo json_encode(["status" => "The server administrator disabled the API!"]);
return;
}
include "lib/frontend.php";
$frontend = new frontend();

View file

@ -1,8 +1,14 @@
<?php
chdir("../../");
header("Content-Type: application/json");
chdir("../../");
include "data/config.php";
if(config::API_ENABLED === false){
echo json_encode(["status" => "The server administrator disabled the API!"]);
return;
}
include "lib/frontend.php";
$frontend = new frontend();

View file

@ -1,8 +1,14 @@
<?php
chdir("../../");
header("Content-Type: application/json");
chdir("../../");
include "data/config.php";
if(config::API_ENABLED === false){
echo json_encode(["status" => "The server administrator disabled the API!"]);
return;
}
include "lib/frontend.php";
$frontend = new frontend();
@ -21,7 +27,13 @@ new captcha($null, $null, $null, "web", false);
$get = $frontend->parsegetfilters($_GET, $filters);
if(!isset($_GET["extendedsearch"])){
if(
isset($_GET["extendedsearch"]) &&
$_GET["extendedsearch"] == "yes"
){
$get["extendedsearch"] = "yes";
}else{
$get["extendedsearch"] = "no";
}

View file

@ -7,6 +7,7 @@ if(!isset($_GET["s"])){
die();
}
include "data/config.php";
include "lib/curlproxy.php";
$proxy = new proxy();

View file

@ -1,5 +1,6 @@
<?php
include "data/config.php";
new sc_audio();
class sc_audio{

103
data/config.php Normal file

@ -0,0 +1,103 @@
<?php
class config{
// Welcome to the 4get configuration file
// When updating your instance, please make sure this file isn't missing
// any parameters.
// 4get version. Please keep this updated
const VERSION = 5;
// Will be shown pretty much everywhere.
const SERVER_NAME = "4get";
// Will be shown in <meta> tag on home page
const SERVER_SHORT_DESCRIPTION = "They live in our walls!";
// Will be shown in server list ping (null for no description)
const SERVER_LONG_DESCRIPTION = null;
// Add your own themes in "static/themes". Set to "Dark" for default theme.
// Eg. To use "static/themes/Cream.css", specify "Cream".
const DEFAULT_THEME = "Dark";
// Enable the API?
const API_ENABLED = true;
// Bot protection
// 4get.ca has been hit with 250k bot reqs every single day for months
// you probably want to enable this if your instance is public...
// 0 = disabled
// 1 = ask for image captcha (requires image dataset & imagick 6.9.11-60)
// @TODO: 2 = invite only (users need a pass)
const BOT_PROTECTION = 0;
// if BOT_PROTECTION is set to 1, specify the available datasets here
// images should be named from 1.png to X.png, and be 100x100 in size
// Eg. data/captcha/birds/1.png up to 2263.png
const CAPTCHA_DATASET = [
// example:
// ["birds", 2263],
// ["fumo_plushies", 1006],
// ["minecraft", 848]
];
// List of domains that point to your servers. Include your tor/i2p
// addresses here! Must be a valid URL. Won't affect links placed on
// the homepage.
const ALT_ADDRESSES = [
//"https://4get.alt-tld",
//"http://4getwebfrq5zr4sxugk6htxvawqehxtdgjrbcn2oslllcol2vepa23yd.onion"
];
// Known 4get instances. MUST use the https protocol if your instance uses
// it. Is used to generate a distributed list of instances.
// To appear in an instance's list, contact its host. Once everyone has added
// each other, your serber should appear everywhere.
const INSTANCES = [
"https://4get.ca",
"https://4get.zzls.xyz",
"https://4get.silly.computer",
"https://4g.opnxng.com",
"https://4get.konakona.moe"
];
// Default user agent to use for scraper requests. Sometimes ignored to get specific webpages
// Changing this might break things.
const USER_AGENT = "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/119.0";
// Proxy pool assignments for each scraper
// false = Use server's raw IP
// string = will load a proxy list from data/proxies
// Eg. "onion" will load data/proxies/onion.txt
const PROXY_DDG = false; // duckduckgo
const PROXY_BRAVE = false;
const PROXY_FB = false; // facebook
const PROXY_GOOGLE = false;
const PROXY_MARGINALIA = false;
const PROXY_MOJEEK = false;
const PROXY_SC = false; // soundcloud
const PROXY_WIBY = false;
const PROXY_YT = false; // youtube
const PROXY_YEP = false;
const PROXY_PINTEREST = false;
const PROXY_FTM = false; // findthatmeme
const PROXY_IMGUR = false;
const PROXY_YANDEX_W = false; // yandex web
const PROXY_YANDEX_I = false; // yandex images
const PROXY_YANDEX_V = false; // yandex videos
//
// Scraper-specific parameters
//
// SOUNDCLOUD
// Get these parameters by making a search on soundcloud with network
// tab open, then filter URLs using "search?q=". (No need to login)
const SC_USER_ID = "143860-454480-469473-289775";
const SC_CLIENT_TOKEN = "qwfvRfz8PCoa2NldZALK7hhZFIH24Wyx";
// MARGINALIA
// Get an API key by contacting the Marginalia.nu maintainer. The "public" key
// works but is almost always rate-limited.
const MARGINALIA_API_KEY = "public";
}

View file

@ -1,62 +0,0 @@
<?php
// this file exists to separate instance data from the actual about page
// HTML, and to make it easier to add/modify instances cleanly.
$instancelist = [
[
"name" => "lolcat's instance (master)",
"address" => [
"uri" => "https://4get.ca/",
"displayname" => "4get.ca"
],
"altaddresses" => [
[
// all these address blocks will be linked in parentheses
// e.g. 4get.ca (tor) (i2p) etc.
"uri" => "http://4getwebfrq5zr4sxugk6htxvawqehxtdgjrbcn2oslllcol2vepa23yd.onion",
"displayname" => "tor"
]
]
],
[
"name" => "zzls's Chilean instance",
"address" => [
"uri" => "https://4get.zzls.xyz/",
"displayname" => "4get.zzls.xyz"
],
"altaddresses" => [
[
"uri" => "http://4get.zzlsghu6mvvwyy75mvga6gaf4znbp3erk5xwfzedb4gg6qqh2j6rlvid.onion",
"displayname" => "tor"
]
]
],
[
"name" => "zzls's United States instance",
"address" => [
"uri" => "https://4getus.zzls.xyz/",
"displayname" => "4getus.zzls.xyz"
],
"altaddresses" => [
[
"uri" => "http://4getus.zzlsghu6mvvwyy75mvga6gaf4znbp3erk5xwfzedb4gg6qqh2j6rlvid.onion",
"displayname" => "tor"
]
]
],
[
"name" => "4get on a silly computer",
"address" => [
"uri" => "https://4get.silly.computer",
"displayname" => "4get.silly.computer"
],
"altaddresses" => [
[
"uri" => "https://4get.cynic.moe/",
"displayname" => "fallback domain"
]
]
]
]
?>

3
data/proxies/.gitignore vendored Normal file

@ -0,0 +1,3 @@
*
!.gitignore
!onion.txt

13
data/proxies/onion.txt Normal file

@ -0,0 +1,13 @@
# Specify proxies by following this format:
# <type>:<address>:<port>:<username>:<password>
#
# Examples:
# https:1.3.3.7:6969:abcd:efg
# socks4:1.2.3.4:8080::
# raw_ip::::
#
# Available types:
# raw_ip, http, https, socks4, socks5, socks4a, socks5_hostname
# Local tor proxy
socks5:localhost:9050::
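For reference, `get_ip()` in `lib/backend.php` drops blank and commented lines from this file and then picks an entry with a request counter modulo the number of entries. The sketch below mimics that selection and the five-field split in plain shell so you can sanity-check a proxy list by hand. It assumes a POSIX shell with `awk`; `pick_proxy` and `describe_proxy` are hypothetical names, not something 4get ships.

```sh
# Print the proxy line selected for a given request counter.
# $1 = proxy list file, $2 = non-negative request counter
pick_proxy() {
    awk -v i="$2" '
        /^[ \t]*(#|$)/ { next }      # skip comments and blank lines
        { lines[n++] = $0 }          # keep usable entries in order
        END { if (n > 0) print lines[i % n] }
    ' "$1"
}

# Split one proxy line into <type>:<address>:<port>:<username>:<password>.
describe_proxy() {
    IFS=: read -r type address port username password <<EOF
$1
EOF
    printf 'type=%s address=%s port=%s user=%s\n' \
        "$type" "$address" "$port" "${username:-<none>}"
}

# Usage:
#   describe_proxy "$(pick_proxy data/proxies/onion.txt 0)"
```

With a single entry in the file, every counter value maps to that same entry, so rotation only starts to matter once you list several proxies.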

View file

@ -6,6 +6,7 @@ if(!isset($_GET["s"])){
die();
}
include "data/config.php";
new favicon($_GET["s"]);
class favicon{

View file

@ -3,6 +3,8 @@
/*
Initialize random shit
*/
include "data/config.php";
include "lib/frontend.php";
$frontend = new frontend();
@ -26,20 +28,7 @@ try{
}catch(Exception $error){
echo
$frontend->drawerror(
"Shit",
'This scraper returned an error:' .
'<div class="code">' . htmlspecialchars($error->getMessage()) . '</div>' .
'Things you can try:' .
'<ul>' .
'<li>Use a different scraper</li>' .
'<li>Remove keywords that could cause errors</li>' .
'<li>Use another 4get instance</li>' .
'</ul><br>' .
'If the error persists, please <a href="/about">contact the administrator</a>.'
);
die();
$frontend->drawscrapererror($error->getMessage(), $get, "images");
}
if(count($results["image"]) === 0){

View file

@ -1,5 +1,6 @@
<?php
include "data/config.php";
include "lib/frontend.php";
$frontend = new frontend();
@ -8,7 +9,7 @@ $images = glob("banner/*");
echo $frontend->load(
"home.html",
[
"body_class" => $frontend->getthemeclass(false),
"server_short_description" => htmlspecialchars(config::SERVER_SHORT_DESCRIPTION),
"banner" => $images[rand(0, count($images) - 1)]
]
);

55
instances.php Normal file

@ -0,0 +1,55 @@
<?php
include "lib/frontend.php";
$frontend = new frontend();
include "data/config.php";
$params = "";
$first = true;
foreach($_GET as $key => $value){
if(
!is_string($value) ||
$key == "target"
){
continue;
}
if($first === true){
$first = false;
$params = "?";
}else{
$params .= "&";
}
$params .= urlencode($key) . "=" . urlencode($value);
}
if(
!isset($_GET["target"]) ||
!is_string($_GET["target"])
){
$target = "";
}else{
$target = "/" . urlencode($_GET["target"]);
}
$instances = "";
foreach(config::INSTANCES as $instance){
$instances .= '<tr><td class="expand"><a href="' . htmlspecialchars($instance) . $target . $params . '" target="_BLANK" rel="noreferrer">' . htmlspecialchars($instance) . '</a></td></tr>';
}
echo
$frontend->load(
"instances.html",
[
"instances_html" => $instances
]
);

197
lib/backend.php Normal file

@ -0,0 +1,197 @@
<?php
class backend{
public function __construct($scraper){
$this->scraper = $scraper;
$this->requestid = apcu_inc("real_requests");
}
/*
Proxy stuff
*/
public function get_ip(){
$pool = constant("config::PROXY_" . strtoupper($this->scraper));
if($pool === false){
// we don't want a proxy, fuck off!
return 'raw_ip::::';
}
// indent
$proxy_index_raw = apcu_inc("p." . $this->scraper);
$proxylist = file_get_contents("data/proxies/" . $pool . ".txt");
$proxylist = explode("\n", $proxylist);
// ignore empty or commented lines
$proxylist = array_filter($proxylist, function($entry){
$entry = ltrim($entry);
return strlen($entry) > 0 && substr($entry, 0, 1) != "#";
});
$proxylist = array_values($proxylist);
return $proxylist[$proxy_index_raw % count($proxylist)];
}
// this function is also called directly on nextpage
public function assign_proxy(&$curlproc, $ip){
// parse proxy line
[
$type,
$address,
$port,
$username,
$password
] = explode(":", $ip, 5);
switch($type){
case "raw_ip":
return;
break;
case "http":
case "https":
curl_setopt($curlproc, CURLOPT_PROXYTYPE, CURLPROXY_HTTP);
curl_setopt($curlproc, CURLOPT_PROXY, $type . "://" . $address . ":" . $port);
break;
case "socks4":
curl_setopt($curlproc, CURLOPT_PROXYTYPE, CURLPROXY_SOCKS4);
curl_setopt($curlproc, CURLOPT_PROXY, $address . ":" . $port);
break;
case "socks5":
curl_setopt($curlproc, CURLOPT_PROXYTYPE, CURLPROXY_SOCKS5);
curl_setopt($curlproc, CURLOPT_PROXY, $address . ":" . $port);
break;
case "socks4a":
curl_setopt($curlproc, CURLOPT_PROXYTYPE, CURLPROXY_SOCKS4A);
curl_setopt($curlproc, CURLOPT_PROXY, $address . ":" . $port);
break;
case "socks5_hostname":
curl_setopt($curlproc, CURLOPT_PROXYTYPE, CURLPROXY_SOCKS5_HOSTNAME);
curl_setopt($curlproc, CURLOPT_PROXY, $address . ":" . $port);
break;
}
if($username != ""){
curl_setopt($curlproc, CURLOPT_PROXYUSERPWD, $username . ":" . $password);
}
}
/*
Next page stuff
*/
public function store($payload, $page, $proxy){
$page = $page[0];
$password = random_bytes(256); // 2048 bit
$salt = random_bytes(16);
$key = hash_pbkdf2("sha512", $password, $salt, 20000, 32, true);
$iv =
random_bytes(
openssl_cipher_iv_length("aes-256-gcm")
);
$tag = "";
$out = openssl_encrypt($payload, "aes-256-gcm", $key, OPENSSL_RAW_DATA, $iv, $tag, "", 16);
$key = apcu_inc("key", 1);
apcu_store(
$page . "." .
$this->scraper .
$this->requestid,
gzdeflate($proxy . "," . $salt.$iv.$out.$tag),
900 // cache information for 15 minutes blaze it
);
return
$this->scraper . $this->requestid . "." .
rtrim(strtr(base64_encode($password), '+/', '-_'), '=');
}
public function get($npt, $page){
$page = $page[0];
$explode = explode(".", $npt, 2);
if(count($explode) !== 2){
throw new Exception("Malformed nextPageToken!");
}
$apcu = $page . "." . $explode[0];
$key = $explode[1];
$payload = apcu_fetch($apcu);
if($payload === false){
throw new Exception("The nextPageToken is invalid or has expired!");
}
$key =
base64_decode(
str_pad(
strtr($key, '-_', '+/'),
// str_pad() takes the target length, so round up to a multiple of 4
strlen($key) + ((4 - strlen($key) % 4) % 4),
'=',
STR_PAD_RIGHT
)
);
$payload = gzinflate($payload);
// get proxy
[
$proxy,
$payload
] = explode(",", $payload, 2);
$key =
hash_pbkdf2(
"sha512",
$key,
substr($payload, 0, 16), // salt
20000,
32,
true
);
$ivlen = openssl_cipher_iv_length("aes-256-gcm");
$payload =
openssl_decrypt(
substr(
$payload,
16 + $ivlen,
-16
),
"aes-256-gcm",
$key,
OPENSSL_RAW_DATA,
substr($payload, 16, $ivlen),
substr($payload, -16)
);
if($payload === false){
throw new Exception("The nextPageToken is invalid or has expired!");
}
// remove the key after using
apcu_delete($apcu);
return [$payload, $proxy];
}
}

View file

@ -4,6 +4,19 @@ class captcha{
public function __construct($frontend, $get, $filters, $page, $output){
// check if we want captcha
if(config::BOT_PROTECTION !== 1){
if($output === true){
$frontend->loadheader(
$get,
$filters,
$page
);
}
return;
}
/*
Validate cookie, if it exists
*/
@ -46,6 +59,7 @@ class captcha{
if($output === false){
http_response_code(429); // too many reqs
echo json_encode([
"status" => "The \"pass\" token in your cookies is missing or has expired!!"
]);
@ -184,15 +198,6 @@ class captcha{
}
}
/*
Generate random grid data to pass to captcha.php
*/
$dataset = [
["birds", 2263],
["fumo_plushies", 1006],
["minecraft", 848]
];
// get the positions for the answers
// will return between 3 and 6 answer positions
$range = range(0, 15);
@ -216,17 +221,18 @@ class captcha{
}
// choose a dataset
$choosen = &$dataset[random_int(0, count($dataset) - 1)];
$c = count(config::CAPTCHA_DATASET);
$choosen = config::CAPTCHA_DATASET[random_int(0, $c - 1)];
$choices = [];
for($i=0; $i<count($dataset); $i++){
for($i=0; $i<$c; $i++){
if($dataset[$i][0] == $choosen[0]){
if(config::CAPTCHA_DATASET[$i][0] == $choosen[0]){
continue;
}
$choices[] = $dataset[$i];
$choices[] = config::CAPTCHA_DATASET[$i];
}
// generate grid data
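
The dataset selection in this hunk can be sketched in isolation. This is a minimal sketch, assuming each entry is a `[name, image_count]` pair as in the removed `$dataset` array; `pick_dataset` is a hypothetical helper, not a function in the repository.

```php
<?php
// Hypothetical helper: pick one dataset at random and return the
// remaining ones as decoys. Each entry is assumed to be [name, image_count].
function pick_dataset(array $datasets){
	$c = count($datasets);
	$chosen = $datasets[random_int(0, $c - 1)];

	$choices = [];
	for($i = 0; $i < $c; $i++){
		if($datasets[$i][0] == $chosen[0]){
			continue; // skip the dataset that holds the answers
		}
		$choices[] = $datasets[$i];
	}

	return [$chosen, $choices];
}

[$chosen, $choices] = pick_dataset([
	["birds", 2263],
	["fumo_plushies", 1006],
	["minecraft", 848]
]);
```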


@ -152,7 +152,7 @@ class proxy{
$curl,
CURLOPT_HTTPHEADER,
[
"User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/116.0",
"User-Agent: " . config::USER_AGENT,
"Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8",
"Accept-Language: en-US,en;q=0.5",
"Accept-Encoding: gzip, deflate",
@ -180,7 +180,7 @@ class proxy{
$curl,
CURLOPT_HTTPHEADER,
[
"User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/116.0",
"User-Agent: " . config::USER_AGENT,
"Accept: image/avif,image/webp,*/*",
"Accept-Language: en-US,en;q=0.5",
"Accept-Encoding: gzip, deflate",
@ -379,7 +379,7 @@ class proxy{
$curl,
CURLOPT_HTTPHEADER,
[
"User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/116.0",
"User-Agent: " . config::USER_AGENT,
"Accept: image/avif,image/webp,*/*",
"Accept-Language: en-US,en;q=0.5",
"Accept-Encoding: gzip, deflate, br",
@ -395,7 +395,7 @@ class proxy{
$curl,
CURLOPT_HTTPHEADER,
[
"User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/116.0",
"User-Agent: " . config::USER_AGENT,
"Accept: audio/webm,audio/ogg,audio/wav,audio/*;q=0.9,application/ogg;q=0.7,video/*;q=0.6,*/*;q=0.5",
"Accept-Language: en-US,en;q=0.5",
"Accept-Encoding: gzip, deflate, br",


@ -4,6 +4,41 @@ class frontend{
public function load($template, $replacements = []){
$replacements["server_name"] = htmlspecialchars(config::SERVER_NAME);
$replacements["version"] = config::VERSION;
if(isset($_COOKIE["theme"])){
$theme = str_replace(["/", "."], "", $_COOKIE["theme"]);
if(
$theme != "Dark" &&
!is_file("static/themes/" . $theme . ".css")
){
$theme = config::DEFAULT_THEME;
}
}else{
$theme = config::DEFAULT_THEME;
}
if($theme != "Dark"){
$replacements["style"] = '<link rel="stylesheet" href="/static/themes/' . $theme . '.css?v' . config::VERSION . '">';
}else{
$replacements["style"] = "";
}
if(isset($_COOKIE["scraper_ac"])){
$replacements["ac"] = '?ac=' . htmlspecialchars($_COOKIE["scraper_ac"]);
}else{
$replacements["ac"] = '';
}
$handle = fopen("template/{$template}", "r");
$data = fread($handle, filesize("template/{$template}"));
fclose($handle);
@ -29,30 +64,6 @@ class frontend{
return trim($html);
}
public function getthemeclass($raw = true){
if(
isset($_COOKIE["theme"]) &&
$_COOKIE["theme"] == "cream"
){
$body_class = "theme-white ";
}else{
$body_class = "";
}
if(
$raw &&
$body_class != ""
){
return ' class="' . rtrim($body_class) . '"';
}
return $body_class;
}
public function loadheader(array $get, array $filters, string $page){
echo
@ -62,8 +73,7 @@ class frontend{
"index" => "no",
"search" => htmlspecialchars($get["s"]),
"tabs" => $this->generatehtmltabs($page, $get["s"]),
"filters" => $this->generatehtmlfilters($filters, $get),
"body_class" => $this->getthemeclass()
"filters" => $this->generatehtmlfilters($filters, $get)
]);
if(
@ -74,7 +84,6 @@ class frontend{
){
// bot detected !!
echo
$this->drawerror(
"Tshh, blocked!",
'You were blocked from viewing this page. If you wish to scrape data from 4get, please consider running <a href="https://git.lolcat.ca/lolcat/4get" rel="noreferrer nofollow">your own 4get instance</a> or using <a href="/api.txt">the API</a>.',
@ -85,7 +94,7 @@ class frontend{
public function drawerror($title, $error){
return
echo
$this->load("search.html", [
"class" => "",
"right-left" => "",
@ -96,6 +105,23 @@ class frontend{
$error .
'</div>'
]);
die();
}
public function drawscrapererror($error, $get, $target){
$this->drawerror(
"Shit",
'This scraper returned an error:' .
'<div class="code">' . htmlspecialchars($error) . '</div>' .
'Things you can try:' .
'<ul>' .
'<li>Use a different scraper</li>' .
'<li>Remove keywords that could cause errors</li>' .
'<li><a href="/instances?target=' . $target . "&" . $this->buildquery($get, false) . '">Try your search on another 4get instance</a></li>' .
'</ul><br>' .
'If the error persists, please <a href="/about">contact the administrator</a>.'
);
}
public function drawtextresult($site, $greentext = null, $duration = null, $keywords, $tabindex = true, $customhtml = null){
@ -819,30 +845,7 @@ class frontend{
public function getscraperfilters($page){
$get_scraper = null;
switch($page){
case "web":
$get_scraper = isset($_COOKIE["scraper_web"]) ? $_COOKIE["scraper_web"] : null;
break;
case "images":
$get_scraper = isset($_COOKIE["scraper_images"]) ? $_COOKIE["scraper_images"] : null;
break;
case "videos":
$get_scraper = isset($_COOKIE["scraper_videos"]) ? $_COOKIE["scraper_videos"] : null;
break;
case "news":
$get_scraper = isset($_COOKIE["scraper_news"]) ? $_COOKIE["scraper_news"] : null;
break;
case "music":
$get_scraper = isset($_COOKIE["scraper_news"]) ? $_COOKIE["scraper_news"] : null;
break;
}
$get_scraper = isset($_COOKIE["scraper_$page"]) ? $_COOKIE["scraper_$page"] : null;
if(
isset($_GET["scraper"]) &&
@ -1148,32 +1151,8 @@ class frontend{
break;
case "_SEARCH":
// get search string & bang
$sanitized[$parameter] = trim($sanitized[$parameter]);
$sanitized["bang"] = "";
if(
strlen($sanitized[$parameter]) !== 0 &&
$sanitized[$parameter][0] == "!"
){
$sanitized[$parameter] = explode(" ", $sanitized[$parameter], 2);
$sanitized["bang"] = trim($sanitized[$parameter][0]);
if(count($sanitized[$parameter]) === 2){
$sanitized[$parameter] = trim($sanitized[$parameter][1]);
}else{
$sanitized[$parameter] = "";
}
$sanitized["bang"] = ltrim($sanitized["bang"], "!");
}
$sanitized[$parameter] = ltrim($sanitized[$parameter], "! \n\r\t\v\x00");
// get search string
$sanitized["s"] = trim($sanitized[$parameter]);
}
}
}


@ -442,5 +442,3 @@ class fuckhtml{
return json_decode($json_out, true);
}
}
?>


@ -1,106 +0,0 @@
<?php
class nextpage{
public function __construct($scraper){
$this->scraper = $scraper;
}
public function store($payload, $page){
$page = $page[0];
$password = random_bytes(256); // 2048 bit
$salt = random_bytes(16);
$key = hash_pbkdf2("sha512", $password, $salt, 20000, 32, true);
$iv =
random_bytes(
openssl_cipher_iv_length("aes-256-gcm")
);
$tag = "";
$out = openssl_encrypt($payload, "aes-256-gcm", $key, OPENSSL_RAW_DATA, $iv, $tag, "", 16);
$key = apcu_inc("key", 1);
apcu_store(
$page . "." .
$this->scraper .
(string)$key,
gzdeflate($salt.$iv.$out.$tag),
900 // cache information for 15 minutes blaze it
);
return
$this->scraper . $key . "." .
rtrim(strtr(base64_encode($password), '+/', '-_'), '=');
}
public function get($npt, $page){
$page = $page[0];
$explode = explode(".", $npt, 2);
if(count($explode) !== 2){
throw new Exception("Malformed nextPageToken!");
}
$apcu = $page . "." . $explode[0];
$key = $explode[1];
$payload = apcu_fetch($apcu);
if($payload === false){
throw new Exception("The nextPageToken is invalid or has expired!");
}
$key =
base64_decode(
str_pad(
strtr($key, '-_', '+/'),
strlen($key) % 4,
'=',
STR_PAD_RIGHT
)
);
$payload = gzinflate($payload);
$key =
hash_pbkdf2(
"sha512",
$key,
substr($payload, 0, 16), // salt
20000,
32,
true
);
$ivlen = openssl_cipher_iv_length("aes-256-gcm");
$payload =
openssl_decrypt(
substr(
$payload,
16 + $ivlen,
-16
),
"aes-256-gcm",
$key,
OPENSSL_RAW_DATA,
substr($payload, 16, $ivlen),
substr($payload, -16)
);
if($payload === false){
throw new Exception("The nextPageToken is invalid or has expired!");
}
// remove the key after using
apcu_delete($apcu);
return $payload;
}
}


@ -3,6 +3,8 @@
/*
Initialize random shit
*/
include "data/config.php";
include "lib/frontend.php";
$frontend = new frontend();
@ -28,20 +30,7 @@ try{
}catch(Exception $error){
echo
$frontend->drawerror(
"Shit",
'This scraper returned an error:' .
'<div class="code">' . htmlspecialchars($error->getMessage()) . '</div>' .
'Things you can try:' .
'<ul>' .
'<li>Use a different scraper</li>' .
'<li>Remove keywords that could cause errors</li>' .
'<li>Use another 4get instance</li>' .
'</ul><br>' .
'If the error persists, please <a href="/about">contact the administrator</a>.'
);
die();
$frontend->drawscrapererror($error->getMessage(), $get, "music");
}
$categories = [


@ -3,6 +3,8 @@
/*
Initialize random shit
*/
include "data/config.php";
include "lib/frontend.php";
$frontend = new frontend();
@ -28,20 +30,7 @@ try{
}catch(Exception $error){
echo
$frontend->drawerror(
"Shit",
'This scraper returned an error:' .
'<div class="code">' . htmlspecialchars($error->getMessage()) . '</div>' .
'Things you can try:' .
'<ul>' .
'<li>Use a different scraper</li>' .
'<li>Remove keywords that could cause errors</li>' .
'<li>Use another 4get instance</li>' .
'</ul><br>' .
'If the error persists, please <a href="/about">contact the administrator</a>.'
);
die();
$frontend->drawscrapererror($error->getMessage(), $get, "news");
}
/*

opensearch.php Normal file

@ -0,0 +1,29 @@
<?php
header("Content-Type: application/xml");
include "data/config.php";
$domain =
htmlspecialchars(
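// SERVER_PROTOCOL holds values like "HTTP/1.1" and never contains "https"; check the HTTPS key instead
((isset($_SERVER["HTTPS"]) && $_SERVER["HTTPS"] !== "" && strtolower($_SERVER["HTTPS"]) !== "off") ? 'https' : 'http') .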
'://' . $_SERVER["HTTP_HOST"]
);
echo
'<?xml version="1.0" encoding="UTF-8"?>' .
'<OpenSearchDescription xmlns="http://a9.com/-/spec/opensearch/1.1/">' .
'<ShortName>' . htmlspecialchars(config::SERVER_NAME) . '</ShortName>' .
'<InputEncoding>UTF-8</InputEncoding>' .
'<Image width="16" height="16">' . $domain . '/favicon.ico</Image>' .
'<Url type="text/html" method="GET" template="' . $domain . '/web?s={searchTerms}"/>';
if(
isset($_GET["ac"]) &&
is_string($_GET["ac"]) &&
$_GET["ac"] != "disabled"
){
echo '<Url rel="suggestions" type="application/x-suggestions+json" template="' . $domain . '/api/v1/ac?s={searchTerms}&amp;scraper=' . htmlspecialchars($_GET["ac"]) . '"/>';
}
echo '</OpenSearchDescription>';
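
Scheme detection for the `$domain` value can be sketched as a small function. This is a minimal sketch, not the project's code: `detect_scheme` is a hypothetical helper that takes the `$_SERVER` array as a parameter so it can be exercised outside a web request, and the `X-Forwarded-Proto` branch assumes a trusted reverse proxy sets that header.

```php
<?php
// Hypothetical helper: decide between "http" and "https" from a $_SERVER-like array.
function detect_scheme(array $server){
	// PHP sets HTTPS to a non-empty value on TLS requests (some servers use "off" for plain HTTP)
	if(
		isset($server["HTTPS"]) &&
		$server["HTTPS"] !== "" &&
		strtolower($server["HTTPS"]) !== "off"
	){
		return "https";
	}

	// assumption: only meaningful behind a reverse proxy you control
	if(
		isset($server["HTTP_X_FORWARDED_PROTO"]) &&
		strtolower($server["HTTP_X_FORWARDED_PROTO"]) === "https"
	){
		return "https";
	}

	return "http";
}

echo detect_scheme(["HTTPS" => "on"]) . "://example.org";
```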


@ -1,5 +1,6 @@
<?php
include "data/config.php";
include "lib/curlproxy.php";
$proxy = new proxy();


@ -7,8 +7,8 @@ class brave{
include "lib/fuckhtml.php";
$this->fuckhtml = new fuckhtml();
include "lib/nextpage.php";
$this->nextpage = new nextpage("brave");
include "lib/backend.php";
$this->backend = new backend("brave");
}
public function getfilters($page){
@ -138,13 +138,20 @@ class brave{
"maybe" => "Maybe",
"no" => "No"
]
],
"spellcheck" => [
"display" => "Spellcheck",
"option" => [
"yes" => "Yes",
"no" => "No"
]
]
];
break;
}
}
private function get($url, $get = [], $nsfw, $country){
private function get($proxy, $url, $get = [], $nsfw, $country){
switch($nsfw){
@ -159,7 +166,7 @@ class brave{
}
$headers = [
"User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:107.0) Gecko/20100101 Firefox/110.0",
"User-Agent: " . config::USER_AGENT,
"Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8",
"Accept-Language: en-US,en;q=0.5",
"Accept-Encoding: gzip",
@ -191,10 +198,11 @@ class brave{
curl_setopt($curlproc, CURLOPT_CONNECTTIMEOUT, 30);
curl_setopt($curlproc, CURLOPT_TIMEOUT, 30);
$this->backend->assign_proxy($curlproc, $proxy);
$data = curl_exec($curlproc);
if(curl_errno($curlproc)){
throw new Exception(curl_error($curlproc));
}
@ -207,7 +215,9 @@ class brave{
if($get["npt"]){
// get next page data
$q = json_decode($this->nextpage->get($get["npt"], "web"), true);
[$q, $proxy] = $this->backend->get($get["npt"], "web");
$q = json_decode($q, true);
$search = $q["q"];
$q["spellcheck"] = "0";
@ -222,7 +232,6 @@ class brave{
// get _GET data instead
$search = $get["s"];
if(strlen($search) === 0){
throw new Exception("Search term is empty!");
@ -230,9 +239,10 @@ class brave{
if(strlen($search) > 2048){
throw new Exception("Search query is too long!");
throw new Exception("Search term is too long!");
}
$proxy = $this->backend->get_ip();
$nsfw = $get["nsfw"];
$country = $get["country"];
$older = $get["older"];
@ -288,6 +298,7 @@ class brave{
try{
$html =
$this->get(
$proxy,
"https://search.brave.com/search",
$q,
$nsfw,
@ -361,9 +372,10 @@ class brave{
$q["country"] = $country;
$out["npt"] =
$this->nextpage->store(
$this->backend->store(
json_encode($q),
"web"
"web",
$proxy
);
}
}
@ -759,7 +771,9 @@ class brave{
"description" =>
isset($result["review"]["description"]) ?
$this->limitstrlen(
strip_tags(
$result["review"]["description"]
)
) :
$this->titledots(
$this->fuckhtml
@ -839,6 +853,32 @@ class brave{
"value" => $this->titledots($info["long_desc"])
];
}
// parse ratings
if(
isset($info["ratings"]) &&
$info["ratings"] != "void 0"
){
$description[] = [
"type" => "title",
"value" => "Ratings"
];
foreach($info["ratings"] as $rating){
$description[] = [
"type" => "link",
"url" => $rating["profile"]["url"],
"value" => $rating["profile"]["name"]
];
$description[] = [
"type" => "text",
"value" => ": " . $rating["ratingValue"] . "/" . $rating["bestRating"] . "\n"
];
}
}
}
$table = [];
@ -908,9 +948,9 @@ class brave{
$out["video"][] = [
"title" => $this->titledots($video["title"]),
"description" => $this->titledots($video["description"]),
"date" => isset($video["age"]) ? strtotime($video["age"]) : null,
"duration" => isset($video["video"]["duration"]) ? $this->hms2int($video["video"]["duration"]) : null,
"views" => null,
"date" => isset($video["age"]) && $video["age"] != "void 0" ? strtotime($video["age"]) : null,
"duration" => isset($video["video"]["duration"]) && $video["video"]["duration"] != "void 0" ? $this->hms2int($video["video"]["duration"]) : null,
"views" => isset($video["video"]["views"]) && $video["video"]["views"] != "void 0" ? (int)$video["video"]["views"] : null,
"thumb" =>
isset($video["thumbnail"]["src"]) ?
[
@ -1008,29 +1048,27 @@ class brave{
public function news($get){
$search = $get["s"];
if(strlen($search) === 0){
if($get["npt"]){
throw new Exception("Search term is empty!");
}
[$req, $proxy] = $this->backend->get($get["npt"], "news");
$nsfw = $get["nsfw"];
$country = $get["country"];
$req = json_decode($req, true);
if(strlen($search) > 2048){
$search = $req["q"];
$country = $req["country"];
$nsfw = $req["nsfw"];
$offset = $req["offset"];
$spellcheck = $req["spellcheck"];
throw new Exception("Search query is too long!");
}
/*
$handle = fopen("scraper/brave-news.html", "r");
$html = fread($handle, filesize("scraper/brave-news.html"));
fclose($handle);*/
try{
$html =
$this->get(
$proxy,
"https://search.brave.com/news",
[
"q" => $search
"q" => $search,
"offset" => $offset,
"spellcheck" => $spellcheck
],
$nsfw,
$country
@ -1041,6 +1079,46 @@ class brave{
throw new Exception("Could not fetch search page");
}
}else{
$search = $get["s"];
if(strlen($search) === 0){
throw new Exception("Search term is empty!");
}
if(strlen($search) > 2048){
throw new Exception("Search term is too long!");
}
$proxy = $this->backend->get_ip();
$nsfw = $get["nsfw"];
$country = $get["country"];
$spellcheck = $get["spellcheck"] == "yes" ? "1" : "0";
/*
$handle = fopen("scraper/brave-news.html", "r");
$html = fread($handle, filesize("scraper/brave-news.html"));
fclose($handle);*/
try{
$html =
$this->get(
$proxy,
"https://search.brave.com/news",
[
"q" => $search,
"spellcheck" => $spellcheck
],
$nsfw,
$country
);
}catch(Exception $error){
throw new Exception("Could not fetch search page");
}
}
$out = [
"status" => "ok",
"npt" => null,
@ -1050,6 +1128,17 @@ class brave{
// load html
$this->fuckhtml->load($html);
// get npt
$out["npt"] =
$this->generatenextpagetoken(
$search,
$nsfw,
$country,
$spellcheck,
"news",
$proxy
);
$news =
$this->fuckhtml
->getElementsByClassName(
@ -1183,8 +1272,19 @@ class brave{
public function image($get){
$search = $get["s"];
if(strlen($search) === 0){
throw new Exception("Search term is empty!");
}
if(strlen($search) > 2048){
throw new Exception("Search term is too long!");
}
$country = $get["country"];
$nsfw = $get["nsfw"];
$spellcheck = $get["spellcheck"] == "yes" ? "1" : "0";
$out = [
"status" => "ok",
@ -1195,9 +1295,11 @@ class brave{
try{
$html =
$this->get(
$this->backend->get_ip(), // no nextpage right now, pass proxy directly
"https://search.brave.com/images",
[
"q" => $search
"q" => $search,
"spellcheck" => $spellcheck
],
$nsfw,
$country
@ -1261,9 +1363,75 @@ class brave{
public function video($get){
if($get["npt"]){
[$npt, $proxy] = $this->backend->get($get["npt"], "videos");
$npt = json_decode($npt, true);
$search = $npt["q"];
$offset = $npt["offset"];
$spellcheck = $npt["spellcheck"];
$country = $npt["country"];
$nsfw = $npt["nsfw"];
try{
$html =
$this->get(
$proxy,
"https://search.brave.com/videos",
[
"q" => $search,
"offset" => $offset,
"spellcheck" => $spellcheck
],
$nsfw,
$country
);
}catch(Exception $error){
throw new Exception("Could not fetch search page");
}
}else{
$search = $get["s"];
if(strlen($search) === 0){
throw new Exception("Search term is empty!");
}
if(strlen($search) > 2048){
throw new Exception("Search term is too long!");
}
$country = $get["country"];
$nsfw = $get["nsfw"];
$spellcheck = $get["spellcheck"] == "yes" ? "1" : "0";
$proxy = $this->backend->get_ip();
try{
$html =
$this->get(
$proxy,
"https://search.brave.com/videos",
[
"q" => $search,
"spellcheck" => $spellcheck
],
$nsfw,
$country
);
}catch(Exception $error){
throw new Exception("Could not fetch search page");
}
}
$this->fuckhtml->load($html);
$out = [
"status" => "ok",
@ -1275,21 +1443,17 @@ class brave{
"reel" => []
];
try{
$html =
$this->get(
"https://search.brave.com/videos",
[
"q" => $search
],
// get npt
$out["npt"] =
$this->generatenextpagetoken(
$search,
$nsfw,
$country
$country,
$spellcheck,
"videos",
$proxy
);
}catch(Exception $error){
throw new Exception("Could not fetch search page");
}
/*
$handle = fopen("scraper/brave-video.html", "r");
$html = fread($handle, filesize("scraper/brave-video.html"));
@ -1606,7 +1770,7 @@ class brave{
$data["table"][trim($html[0])] = trim($html[1]);
}
}
/*
private function getimagelinkfromstyle($thumb){
$thumb =
@ -1646,13 +1810,13 @@ class brave{
"url" => $url,
"ratio" => "16:9"
];
}
}*/
private function limitstrlen($text){
return explode("\n", wordwrap($text, 300, "\n"))[0];
}
/*
private function limitwhitespace($text){
return
@ -1661,7 +1825,7 @@ class brave{
" ",
$text
);
}
}*/
private function titledots($title){
@ -1678,6 +1842,52 @@ class brave{
return trim($title);
}
private function generatenextpagetoken($q, $nsfw, $country, $spellcheck, $page, $proxy){
$nextpage =
$this->fuckhtml
->getElementsByClassName("btn", "a");
if(count($nextpage) !== 0){
$nextpage =
$nextpage[count($nextpage) - 1];
if(
strtolower(
$this->fuckhtml
->getTextContent(
$nextpage
)
) == "next"
){
preg_match(
'/offset=([0-9]+)/',
$this->fuckhtml->getTextContent($nextpage["attributes"]["href"]),
$nextpage
);
return
$this->backend->store(
json_encode(
[
"q" => $q,
"offset" => (int)$nextpage[1],
"nsfw" => $nsfw,
"country" => $country,
"spellcheck" => $spellcheck
]
),
$page,
$proxy
);
}
}
return null;
}
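
The offset extraction in `generatenextpagetoken()` can be sketched on its own. This is a minimal sketch, assuming the "next" link carries an `offset=` query parameter; `next_offset` is a hypothetical helper, and unlike the method above it returns `null` when the pattern does not match.

```php
<?php
// Hypothetical helper: pull the numeric offset out of a "next" link.
function next_offset($href){
	if(preg_match('/offset=([0-9]+)/', $href, $match) !== 1){
		return null; // no offset in the link, so no next page token
	}
	return (int)$match[1];
}

var_dump(next_offset("/news?q=minecraft&offset=2&spellcheck=0"));
```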
private function unshiturl($url){
// https://imgs.search.brave.com/XFnbR8Sl7ge82MBDEH7ju0UHImRovMVmQ2qnDvgNTuA/rs:fit:844:225:1/g:ce/aHR0cHM6Ly90c2U0/Lm1tLmJpbmcubmV0/L3RoP2lkPU9JUC54/UWotQXU5N2ozVndT/RDJnNG9BNVhnSGFF/SyZwaWQ9QXBp.jpeg


@ -4,8 +4,11 @@ class ddg{
public function __construct(){
include "lib/nextpage.php";
$this->nextpage = new nextpage("ddg");
include "lib/backend.php";
$this->backend = new backend("ddg");
include "lib/fuckhtml.php";
$this->fuckhtml = new fuckhtml();
}
/*
@ -14,7 +17,7 @@ class ddg{
private const req_web = 0;
private const req_xhr = 1;
private function get($url, $get = [], $reqtype = self::req_web){
private function get($proxy, $url, $get = [], $reqtype = self::req_web){
$curlproc = curl_init();
@ -28,7 +31,7 @@ class ddg{
switch($reqtype){
case self::req_web:
$headers =
["User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/113.0",
["User-Agent: " . config::USER_AGENT,
"Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8",
"Accept-Encoding: gzip",
"Accept-Language: en-US,en;q=0.5",
@ -43,7 +46,7 @@ class ddg{
case self::req_xhr:
$headers =
["User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/113.0",
["User-Agent: " . config::USER_AGENT,
"Accept: */*",
"Accept-Encoding: gzip",
"Accept-Language: en-US,en;q=0.5",
@ -57,6 +60,8 @@ class ddg{
break;
}
$this->backend->assign_proxy($curlproc, $proxy);
curl_setopt($curlproc, CURLOPT_ENCODING, ""); // default encoding
curl_setopt($curlproc, CURLOPT_HTTPHEADER, $headers);
@ -69,7 +74,6 @@ class ddg{
$data = curl_exec($curlproc);
if(curl_errno($curlproc)){
throw new Exception(curl_error($curlproc));
}
@ -541,9 +545,11 @@ class ddg{
public function web($get){
$proxy = null;
if($get["npt"]){
$jsgrep = $this->nextpage->get($get["npt"], "web");
[$jsgrep, $proxy] = $this->backend->get($get["npt"], "web");
$extendedsearch = false;
$inithtml = "";
@ -555,6 +561,7 @@ class ddg{
throw new Exception("Search term is empty!");
}
$proxy = $this->backend->get_ip();
$country = $get["country"];
$nsfw = $get["nsfw"];
$older = $get["older"];
@ -614,9 +621,9 @@ class ddg{
/*
Get html
*/
// https://duckduckgo.com/?q=minecraft&kz=1&k1=-1&kp=-2
try{
$inithtml = $this->get(
$proxy,
"https://duckduckgo.com/",
$get_filters
);
@ -643,6 +650,7 @@ class ddg{
try{
$js = $this->get(
$proxy,
"https://links.duckduckgo.com" . $jsgrep,
[],
ddg::req_xhr
@ -692,6 +700,7 @@ class ddg{
// get definition
$wordnikjs = $this->get(
$proxy,
"https://duckduckgo.com/js/spice/dictionary/definition/" . $wordnik,
[],
ddg::req_xhr
@ -725,6 +734,7 @@ class ddg{
$wordnikaudio_json =
json_decode(
$this->get(
$proxy,
"https://duckduckgo.com/js/spice/dictionary/audio/" . $wordnik,
[],
ddg::req_xhr
@ -922,6 +932,7 @@ class ddg{
try{
$stackjs = $this->get(
$proxy,
"https://duckduckgo.com" . $stack,
[],
ddg::req_xhr
@ -944,7 +955,7 @@ class ddg{
$out["answer"][] = [
"title" => $stackjson["Heading"],
"description" => $this->htmltoarray($stackjson["Abstract"]),
"description" => $this->stackoverflow_parse($stackjson["Abstract"]),
"url" => str_replace(["http://", "ddg"], ["https://", ""], $stackjson["AbstractURL"]),
"thumb" => null,
"table" => [],
@ -973,6 +984,7 @@ class ddg{
try{
$lyricsjs = $this->get(
$proxy,
"https://duckduckgo.com" . $lyrics,
[],
ddg::req_xhr
@ -1166,13 +1178,13 @@ class ddg{
if(isset($answers[$i]["data"]["AbstractText"]) && !empty($answers[$i]["data"]["AbstractText"])){
$description = $this->htmltoarray($answers[$i]["data"]["AbstractText"]);
$description = $this->stackoverflow_parse($answers[$i]["data"]["AbstractText"]);
}elseif(isset($answers[$i]["data"]["Abstract"]) && !empty($answers[$i]["data"]["Abstract"])){
$description = $this->htmltoarray($answers[$i]["data"]["Abstract"]);
$description = $this->stackoverflow_parse($answers[$i]["data"]["Abstract"]);
}elseif(isset($answers[$i]["data"]["Answer"]) && !empty($answers[$i]["data"]["Answer"])){
$description = $this->htmltoarray($answers[$i]["data"]["Answer"]);
$description = $this->stackoverflow_parse($answers[$i]["data"]["Answer"]);
}else{
$description = [];
@ -1310,6 +1322,7 @@ class ddg{
$description = [];
$shitcoinjs = $this->get(
$proxy,
"https://duckduckgo.com/js/spice/cryptocurrency/{$shitcoins[1]}/{$shitcoins[2]}/1",
[],
ddg::req_xhr
@ -1408,6 +1421,7 @@ class ddg{
try{
$currencyjs = $this->get(
$proxy,
"https://duckduckgo.com/js/spice/currency/{$amount}/" . strtolower($currencies[1]) . "/" . strtolower($currencies[2]),
[],
ddg::req_xhr
@ -1607,7 +1621,7 @@ class ddg{
// store next page token
if(isset($web[$i]["n"])){
$out["npt"] = $this->nextpage->store($web[$i]["n"] . "&biaexp=b&eslexp=a&litexp=c&msvrtexp=b&wrap=1", "web");
$out["npt"] = $this->backend->store($web[$i]["n"] . "&biaexp=b&eslexp=a&litexp=c&msvrtexp=b&wrap=1", "web", $proxy);
continue;
}
@ -1874,10 +1888,11 @@ class ddg{
if($get["npt"]){
$npt = $this->nextpage->get($get["npt"], "images");
[$npt, $proxy] = $this->backend->get($get["npt"], "images");
try{
$json = json_decode($this->get(
$proxy,
"https://duckduckgo.com/i.js?" . $npt,
[],
ddg::req_xhr
@ -1895,6 +1910,7 @@ class ddg{
throw new Exception("Search term is empty!");
}
$proxy = $this->backend->get_ip();
$country = $get["country"];
$nsfw = $get["nsfw"];
$date = $get["date"];
@ -1934,6 +1950,7 @@ class ddg{
try{
$html = $this->get(
$proxy,
"https://duckduckgo.com",
$get_filters,
ddg::req_web
@ -1980,6 +1997,7 @@ class ddg{
try{
$json = json_decode($this->get(
$proxy,
"https://duckduckgo.com/i.js",
$js_params,
ddg::req_xhr
@ -2005,10 +2023,11 @@ class ddg{
}
$out["npt"] =
$this->nextpage->store(
$this->backend->store(
explode("?", $json["next"])[1] . "&vqd=" .
$vqd,
"images"
"images",
$proxy
);
}
@ -2046,10 +2065,11 @@ class ddg{
if($get["npt"]){
$npt = $this->nextpage->get($get["npt"], "videos");
[$npt, $proxy] = $this->backend->get($get["npt"], "videos");
try{
$json = json_decode($this->get(
$proxy,
"https://duckduckgo.com/v.js?" .
$npt,
[],
@ -2068,6 +2088,7 @@ class ddg{
throw new Exception("Search term is empty!");
}
$proxy = $this->backend->get_ip();
$country = $get["country"];
$nsfw = $get["nsfw"];
$date = $get["date"];
@ -2099,6 +2120,7 @@ class ddg{
try{
$html = $this->get(
$proxy,
"https://duckduckgo.com",
$get_filters,
ddg::req_web
@ -2123,6 +2145,7 @@ class ddg{
try{
$json = json_decode($this->get(
$proxy,
"https://duckduckgo.com/v.js",
[
"l" => "us-en",
@ -2155,9 +2178,10 @@ class ddg{
if(isset($json["next"])){
$out["npt"] =
$this->nextpage->store(
$this->backend->store(
explode("?", $json["next"])[1],
"videos"
"videos",
$proxy
);
}
@ -2213,11 +2237,12 @@ class ddg{
if($get["npt"]){
$req = $this->nextpage->get($get["npt"], "news");
[$req, $proxy] = $this->backend->get($get["npt"], "news");
try{
$json = json_decode($this->get(
$proxy,
"https://duckduckgo.com/news.js?" .
$req,
[],
@ -2236,6 +2261,7 @@ class ddg{
throw new Exception("Search term is empty!");
}
$proxy = $this->backend->get_ip();
$country = $get["country"];
$nsfw = $get["nsfw"];
$date = $get["date"];
@ -2261,6 +2287,7 @@ class ddg{
try{
$html = $this->get(
$proxy,
"https://duckduckgo.com",
$get_params,
ddg::req_web
@ -2303,6 +2330,7 @@ class ddg{
}
$json = json_decode($this->get(
$proxy,
"https://duckduckgo.com/news.js",
$js_params,
ddg::req_xhr
@ -2323,9 +2351,10 @@ class ddg{
if(isset($json["next"])){
$out["npt"] =
$this->nextpage->store(
$this->backend->store(
explode("?", $json["next"])[1],
"news"
"news",
$proxy
);
}
@ -2415,192 +2444,193 @@ class ddg{
return "https://" . $parse["host"] . "/th?id=" . urlencode($parts["id"]);
}
private function htmltoarray($html){
private function appendtext($payload, &$text, &$index){
$html = strip_tags($html, ["img", "pre", "code", "br", "h1", "h2", "h3", "h4", "h5", "h6", "blockquote", "a"]);
if(trim($payload) == ""){
libxml_use_internal_errors(true);
$dom = new DOMDocument("1.0", "utf-8");
$dom->loadHTML('<div>' . $html . '</div>');
$xpath = new DOMXPath($dom);
$descendants = $xpath->query('//div/node()');
return;
}
$images = $xpath->query('//div/node()/img');
$imageiterator = 0;
if(
$index !== 0 &&
$text[$index - 1]["type"] == "text"
){
if(count($descendants) === 0){
$text[$index - 1]["value"] .= preg_replace('/ $/', " ", $payload);
}else{
$text[] = [
"type" => "text",
"value" => preg_replace('/ $/', " ", $payload)
];
$index++;
}
}
private function stackoverflow_parse($html){
$i = 0;
$answer = [];
$this->fuckhtml->load($html);
$tags = $this->fuckhtml->getElementsByTagName("*");
if(count($tags) === 0){
return [
[
"type" => "text",
"value" => $this->unescapehtml($html)
"value" => htmlspecialchars_decode($html)
]
];
}
$array = [];
$previoustype = null;
foreach($tags as $snippet){
foreach($descendants as $node){
switch($snippet["tagName"]){
// $node->nodeValue = iconv("UTF-8", "ISO-8859-1//TRANSLIT", $node->nodeValue);
case "p":
$this->fuckhtml->load($snippet["innerHTML"]);
$codetags =
$this->fuckhtml
->getElementsByTagName("*");
$tmphtml = $snippet["innerHTML"];
foreach($codetags as $tag){
if(!isset($tag["outerHTML"])){
continue;
}
$tmphtml =
explode(
$tag["outerHTML"],
$tmphtml,
2
);
$value = $this->fuckhtml->getTextContent($tmphtml[0], false, false);
$this->appendtext($value, $answer, $i);
$type = null;
switch($tag["tagName"]){
case "code": $type = "inline_code"; break;
case "em": $type = "italic"; break;
case "blockquote": $type = "quote"; break;
default: $type = "text";
}
if($type !== null){
$value = $this->fuckhtml->getTextContent($tag, false, false);
if(trim($value) != ""){
$answer[] = [
"type" => $type,
"value" => rtrim($value)
];
$i++;
}
}
if(count($tmphtml) === 2){
$tmphtml = $tmphtml[1] . "\n";
}else{
// get node type
switch($node->nodeName){
case "#text":
$type = "text";
break;
}
}
case "pre":
$type = "code";
break;
if(is_array($tmphtml)){
case "code":
$type = "inline_code";
break;
$tmphtml = $tmphtml[0];
}
case "h1":
case "h2":
case "h3":
case "h4":
case "h5":
case "h6":
$type = "title";
break;
if(strlen($tmphtml) !== 0){
case "blockquote":
$type = "quote";
break;
case "a":
$type = "link";
$value = $this->fuckhtml->getTextContent($tmphtml, true, false);
$this->appendtext($value, $answer, $i);
}
break;
case "img":
$type = "image";
$answer[] = [
"type" => "image",
"url" =>
$this->fuckhtml
->getTextContent(
$tag["attributes"]["src"]
)
];
$i++;
break;
}
// add node to array
switch($type){
case "pre":
switch($answer[$i - 1]["type"]){
case "text":
$value = preg_replace(
'/ {2,}/',
" ",
$this->limitnewlines($this->unescapehtml($node->textContent))
case "italic":
$answer[$i - 1]["value"] = rtrim($answer[$i - 1]["value"]);
break;
}
$answer[] =
[
"type" => "code",
"value" =>
rtrim(
$this->fuckhtml
->getTextContent(
$snippet,
true,
false
)
)
];
$i++;
break;
case "ol":
$o = 0;
$this->fuckhtml->load($snippet);
$li =
$this->fuckhtml
->getElementsByTagName("li");
foreach($li as $elem){
$o++;
$this->appendtext(
$o . ". " .
$this->fuckhtml
->getTextContent(
$elem
),
$answer,
$i
);
}
break;
}
}
if(
$previoustype == "quote" ||
$previoustype === null ||
$previoustype == "image" ||
$previoustype == "title" ||
$previoustype == "code"
$i !== 0 &&
$answer[$i - 1]["type"] == "text"
){
$value = ltrim($value);
$answer[$i - 1]["value"] = rtrim($answer[$i - 1]["value"]);
}
if($value == ""){
$previoustype = $type;
continue 2;
}
// merge with previous text node
if($previoustype == "text"){
$array[count($array) - 1]["value"] = trim($array[count($array) - 1]["value"]) . "\n" . $this->bstoutf8($value);
}else{
$array[] = [
"type" => "text",
"value" => $this->bstoutf8($value)
];
}
break;
case "inline_code":
case "bold":
$array[] = [
"type" => "inline_code",
"value" => $this->bstoutf8(trim($this->limitnewlines($this->unescapehtml($node->textContent))))
];
break;
case "link":
// check for link nested inside of image
if(strlen($node->childNodes->item(0)->textContent) !== 0){
$array[] = [
"type" => "link",
"value" => $this->bstoutf8(trim($this->unescapehtml($node->textContent))),
"url" => $this->bstoutf8(preg_replace('/\/ddg$/', "", preg_replace('/^http:\/\//', "https://", $this->sanitizeurl($node->getAttribute("href")))))
];
break;
}
$type = "image";
if($previoustype == "text"){
$array[count($array) - 1]["value"] = rtrim($array[count($array) - 1]["value"]);
}
$array[] = [
"type" => "image",
"url" => $this->bstoutf8(preg_replace('/^http:\/\//', "https://", preg_replace('/^\/\/images\.duckduckgo\.com\/iu\/\?u=/', "", $images->item($imageiterator)->getAttribute("src"))))
];
$imageiterator++;
break;
case "image":
if($previoustype == "text"){
$array[count($array) - 1]["value"] = rtrim($array[count($array) - 1]["value"]);
}
$array[] = [
"type" => "image",
"url" => $this->bstoutf8(preg_replace('/^http:\/\//', "https://", preg_replace('/^\/\/images\.duckduckgo\.com\/iu\/\?u=/', "", $node->getAttribute("src"))))
];
break;
case "quote":
case "title":
case "code":
if($previoustype == "text"){
$array[count($array) - 1]["value"] = rtrim($array[count($array) - 1]["value"]);
}
// no break
default:
$value = trim($this->limitnewlines($this->unescapehtml($node->textContent)));
if($type != "code"){
$value = preg_replace(
'/ {2,}/',
" ",
$value
);
}
$array[] = [
"type" => $type,
"value" => $this->bstoutf8($value)
];
break;
}
$previoustype = $type;
}
return $array;
return $answer;
}
private function bstoutf8($bs){


@ -9,6 +9,9 @@ class facebook{
include "lib/nextpage.php";
$this->nextpage = new nextpage("fb");
include "lib/proxy_pool.php";
$this->proxy = new proxy_pool("facebook");
}
public function getfilters($page){
@ -105,6 +108,8 @@ class facebook{
curl_setopt($curlproc, CURLOPT_CONNECTTIMEOUT, 30);
curl_setopt($curlproc, CURLOPT_TIMEOUT, 30);
$this->proxy->assign_proxy($curlproc);
$data = curl_exec($curlproc);
if(curl_errno($curlproc)){


@ -4,8 +4,8 @@ class ftm{
public function __construct(){
include "lib/nextpage.php";
$this->nextpage = new nextpage("ftm");
include "lib/backend.php";
$this->backend = new backend("ftm");
}
public function getfilters($page){
@ -13,7 +13,7 @@ class ftm{
return [];
}
private function get($url, $search, $offset){
private function get($proxy, $url, $search, $offset){
$curlproc = curl_init();
@ -29,7 +29,7 @@ class ftm{
curl_setopt($curlproc, CURLOPT_ENCODING, ""); // default encoding
curl_setopt($curlproc, CURLOPT_HTTPHEADER,
["User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:107.0) Gecko/20100101 Firefox/110.0",
["User-Agent: " . config::USER_AGENT,
"Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8",
"Accept-Language: en-US,en;q=0.5",
"Accept-Encoding: gzip",
@ -57,6 +57,8 @@ class ftm{
curl_setopt($curlproc, CURLOPT_CONNECTTIMEOUT, 30);
curl_setopt($curlproc, CURLOPT_TIMEOUT, 30);
$this->backend->assign_proxy($curlproc, $proxy);
$data = curl_exec($curlproc);
if(curl_errno($curlproc)){
@ -70,8 +72,6 @@ class ftm{
public function image($get){
$search = $get["s"];
$out = [
"status" => "ok",
"npt" => null,
@ -80,16 +80,28 @@ class ftm{
if($get["npt"]){
$count = (int)$this->nextpage->get($get["npt"], "images");
[$data, $proxy] = $this->backend->get($get["npt"], "images");
$data = json_decode($data, true);
$count = $data["count"];
$search = $data["search"];
}else{
$search = $get["s"];
if(strlen($search) === 0){
throw new Exception("Search term is empty!");
}
$count = 0;
$proxy = $this->backend->get_ip();
}
try{
$json =
json_decode(
$this->get(
$proxy,
"https://findthatmeme.com/api/v1/search",
$search,
$count
@ -134,14 +146,15 @@ class ftm{
];
}
if($count === 50){
$out["npt"] =
$this->nextpage->store(
$count,
"images"
$this->backend->store(
json_encode([
"count" => $count,
"search" => $search
]),
"images",
$proxy
);
}
return $out;
}


@ -10,8 +10,8 @@ class google{
include "lib/fuckhtml.php";
$this->fuckhtml = new fuckhtml();
include "lib/nextpage.php";
$this->nextpage = new nextpage("google");
include "lib/backend.php";
$this->backend = new backend("google");
}
public function getfilters($page){
@ -727,7 +727,7 @@ class google{
}
}
private function get($url, $get = []){
private function get($proxy, $url, $get = []){
$headers = [
"User-Agent: Mozilla/5.0 (Linux; U; Android 2.3.3; pt-pt; LG-P500h-parrot Build/GRI40) AppleWebKit/533.1 (KHTML, like Gecko) Version/4.0 Mobile Safari/533.1 MMS/LG-Android-MMS-V1.0/1.2",
@ -761,6 +761,8 @@ class google{
curl_setopt($curlproc, CURLOPT_CONNECTTIMEOUT, 30);
curl_setopt($curlproc, CURLOPT_TIMEOUT, 30);
$this->backend->assign_proxy($curlproc, $proxy);
$data = curl_exec($curlproc);
if(curl_errno($curlproc)){
@ -771,7 +773,7 @@ class google{
curl_close($curlproc);
return $data;
}
/*
public function web($get){
$search = $get["s"];
@ -877,9 +879,9 @@ class google{
if(count($title) !== 0){
/*
Container is a web link
*/
//
// Container is a web link
//
$web = [
"title" =>
$this->titledots(
@ -1051,9 +1053,9 @@ class google{
continue;
}
/*
Parse rating object
*/
//
// Parse rating object
//
if($is_rating >= -1){
@ -1102,9 +1104,9 @@ class google{
continue;
}
/*
Parse standalone text
*/
//
// Parse standalone text
//
$additional_info[] = $innertext;
}
}
@ -1194,9 +1196,9 @@ class google{
$container_title == "people also search for"
){
/*
Parse related searches
*/
//
// Parse related searches
//
$as =
$this->fuckhtml
->getElementsByTagName("a");
@ -1212,9 +1214,9 @@ class google{
continue;
}
/*
Parse image carousel
*/
//
// Parse image carousel
//
$title_container =
$this->fuckhtml
->getElementsByClassName(
@ -1239,9 +1241,9 @@ class google{
if($title_container == "imagesview all"){
/*
Image carousel
*/
//
// Image carousel
//
$pcitem =
$this->fuckhtml
->getElementsByClassName(
@ -1316,9 +1318,9 @@ class google{
}
}
/*
Get next page
*/
//
// Get next page
//
$as =
$this->fuckhtml
->getElementsByTagName("a");
@ -1340,7 +1342,7 @@ class google{
}
return $out;
}
}*/
public function image($get){
@ -1348,17 +1350,22 @@ class google{
// generate parameters
if($get["npt"]){
$params =
json_decode(
$this->nextpage->get(
[$params, $proxy] =
$this->backend->get(
$get["npt"],
"images"
),
true
);
$params = json_decode($params, true);
}else{
$search = $get["s"];
if(strlen($search) === 0){
throw new Exception("Search term is empty!");
}
$proxy = $this->backend->get_ip();
$country = $get["country"];
$nsfw = $get["nsfw"];
$lang = $get["lang"];
@ -1475,6 +1482,7 @@ class google{
try{
$html =
$this->get(
$proxy,
"https://www.google.com/search",
$params
);
@ -1578,9 +1586,10 @@ class google{
$params["ijn"] = (int)$params["ijn"] + 1;
$out["npt"] =
$this->nextpage->store(
$this->backend->store(
json_encode($params),
"images"
"images",
$proxy
);
}else{
@ -1628,9 +1637,10 @@ class google{
$params["imgvl"] = $imgvl;
$out["npt"] =
$this->nextpage->store(
$this->backend->store(
json_encode($params),
"images"
"images",
$proxy
);
}
}


@ -4,11 +4,11 @@ class imgur{
public function __construct(){
include "lib/nextpage.php";
$this->nextpage = new nextpage("imgur");
include "lib/fuckhtml.php";
$this->fuckhtml = new fuckhtml();
include "lib/backend.php";
$this->backend = new backend("imgur");
}
public function getfilters($page){
@ -57,7 +57,7 @@ class imgur{
];
}
private function get($url, $get = []){
private function get($proxy, $url, $get = []){
$curlproc = curl_init();
@ -70,7 +70,7 @@ class imgur{
curl_setopt($curlproc, CURLOPT_ENCODING, ""); // default encoding
curl_setopt($curlproc, CURLOPT_HTTPHEADER,
["User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:107.0) Gecko/20100101 Firefox/110.0",
["User-Agent: " . config::USER_AGENT,
"Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8",
"Accept-Language: en-US,en;q=0.5",
"Accept-Encoding: gzip",
@ -90,6 +90,8 @@ class imgur{
curl_setopt($curlproc, CURLOPT_CONNECTTIMEOUT, 30);
curl_setopt($curlproc, CURLOPT_TIMEOUT, 30);
$this->backend->assign_proxy($curlproc, $proxy);
$data = curl_exec($curlproc);
if(curl_errno($curlproc)){
@ -105,15 +107,14 @@ class imgur{
if($get["npt"]){
$filter =
json_decode(
$this->nextpage->get(
[$filter, $proxy] =
$this->backend->get(
$get["npt"],
"images"
),
true
);
$filter = json_decode($filter, true);
$search = $filter["s"];
unset($filter["s"]);
@ -134,6 +135,12 @@ class imgur{
}else{
$search = $get["s"];
if(strlen($search) === 0){
throw new Exception("Search term is empty!");
}
$proxy = $this->backend->get_ip();
$sort = $get["sort"];
$time = $get["time"];
$format = $get["format"];
@ -165,6 +172,7 @@ class imgur{
try{
$html =
$this->get(
$proxy,
"https://imgur.com/search/$sort/$time/page/$page",
$filter
);
@ -238,9 +246,10 @@ class imgur{
$filter["page"] = $page + 1;
$out["npt"] =
$this->nextpage->store(
$this->backend->store(
json_encode($filter),
"images"
"images",
$proxy
);
}


@ -3,7 +3,8 @@
class marginalia{
public function __construct(){
$this->key = "public";
include "lib/backend.php";
$this->backend = new backend("marginalia");
}
public function getfilters($page){
@ -76,10 +77,10 @@ class marginalia{
}
}
private function get($url, $get = []){
private function get($proxy, $url, $get = []){
$headers = [
"User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:107.0) Gecko/20100101 Firefox/110.0",
"User-Agent: " . config::USER_AGENT,
"Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8",
"Accept-Language: en-US,en;q=0.5",
"Accept-Encoding: gzip",
@ -110,6 +111,8 @@ class marginalia{
curl_setopt($curlproc, CURLOPT_CONNECTTIMEOUT, 30);
curl_setopt($curlproc, CURLOPT_TIMEOUT, 30);
$this->backend->assign_proxy($curlproc, $proxy);
$data = curl_exec($curlproc);
if(curl_errno($curlproc)){
@ -124,6 +127,11 @@ class marginalia{
public function web($get){
$search = [$get["s"]];
if(strlen($get["s"]) === 0){
throw new Exception("Search term is empty!");
}
$profile = $get["profile"];
$format = $get["format"];
$file = $get["file"];
@ -184,7 +192,8 @@ class marginalia{
try{
$json =
$this->get(
"https://api.marginalia.nu/{$this->key}/search/" . urlencode($search),
$this->backend->get_ip(), // no nextpage
"https://api.marginalia.nu/" . config::MARGINALIA_API_KEY . "/search/" . urlencode($search),
$params
);
}catch(Exception $error){


@ -6,8 +6,8 @@ class mojeek{
include "lib/fuckhtml.php";
$this->fuckhtml = new fuckhtml();
include "lib/nextpage.php";
$this->nextpage = new nextpage("mojeek");
include "lib/backend.php";
$this->backend = new backend("mojeek");
}
public function getfilters($page){
@ -371,10 +371,10 @@ class mojeek{
}
}
private function get($url, $get = []){
private function get($proxy, $url, $get = []){
$headers = [
"User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:107.0) Gecko/20100101 Firefox/110.0",
"User-Agent: " . config::USER_AGENT,
"Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8",
"Accept-Language: en-US,en;q=0.5",
"Accept-Encoding: gzip",
@ -405,6 +405,8 @@ class mojeek{
curl_setopt($curlproc, CURLOPT_CONNECTTIMEOUT, 30);
curl_setopt($curlproc, CURLOPT_TIMEOUT, 30);
$this->backend->assign_proxy($curlproc, $proxy);
$data = curl_exec($curlproc);
if(curl_errno($curlproc)){
@ -420,11 +422,12 @@ class mojeek{
if($get["npt"]){
$token = $this->nextpage->get($get["npt"], "web");
[$token, $proxy] = $this->backend->get($get["npt"], "web");
try{
$html =
$this->get(
$proxy,
"https://www.mojeek.com" . $token,
[]
);
@ -485,9 +488,12 @@ class mojeek{
$params["si"] = $domain;
}
$proxy = $this->backend->get_ip();
try{
$html =
$this->get(
$proxy,
"https://www.mojeek.com/search",
$params
);
@ -529,11 +535,12 @@ class mojeek{
return $out;
}
$this->fuckhtml->load($results[0]);
/*
Get search results
Get all search result divs
*/
foreach($results as $container){
$this->fuckhtml->load($container);
$results =
$this->fuckhtml
->getElementsByTagName("li");
@ -612,6 +619,7 @@ class mojeek{
$out["web"][] = $data;
}
}
/*
Get instant answers
@ -969,12 +977,13 @@ class mojeek{
if($a["innerHTML"] == "Next"){
$out["npt"] = $this->nextpage->store(
$out["npt"] = $this->backend->store(
$this->fuckhtml
->getTextContent(
$a["attributes"]["href"]
),
"web"
"web",
$proxy
);
}
}
@ -1001,6 +1010,7 @@ class mojeek{
try{
$html =
$this->get(
$this->backend->get_ip(),
"https://www.mojeek.com/search",
[
"q" => $search,
@ -1011,41 +1021,20 @@ class mojeek{
throw new Exception("Failed to get HTML");
}
/*
$handle = fopen("scraper/mojeek.html", "r");
$html = fread($handle, filesize("scraper/mojeek.html"));
fclose($handle);*/
/*
Get big, standard and smaller nodes
fclose($handle);
*/
foreach(
[
"results-extended",
"results-standard"
]
as $categoryname
){
$this->fuckhtml->load($html);
$categories =
$this->fuckhtml
->getElementsByClassName(
$categoryname,
"ul"
);
$articles =
$this->fuckhtml->getElementsByTagName("article");
foreach($categories as $category){
foreach($articles as $article){
$this->fuckhtml->load($category);
$nodes =
$this->fuckhtml
->getElementsByTagName("li");
foreach($nodes as $node){
$this->fuckhtml->load($article);
$data = [
"title" => null,
@ -1060,15 +1049,7 @@ class mojeek{
"url" => null
];
/*
Parse the results
*/
$this->fuckhtml->load($node);
// get title + url
$a =
$this->fuckhtml
->getElementsByTagName("a")[0];
$a = $this->fuckhtml->getElementsByTagName("a")[0];
$data["title"] =
$this->fuckhtml
@ -1082,65 +1063,53 @@ class mojeek{
$a["attributes"]["href"]
);
// get image
$image =
$this->fuckhtml
->getElementsByTagName("img");
if(count($image) !== 0){
$data["thumb"] = [
"url" =>
urldecode(
str_replace(
"/image?img=",
"",
$this->fuckhtml
->getTextContent(
$image[0]["attributes"]["src"]
)
)
),
"ratio" => "16:9"
];
}
// get description
$description =
$this->fuckhtml
->getElementsByClassName("s", "p");
if(count($description) !== 0){
$p = $this->fuckhtml->getElementsByTagName("p");
$data["description"] =
$this->titledots(
$this->fuckhtml
->getTextContent(
$description[0]
$this->fuckhtml
->getElementsByClassName(
"s",
$p
)[0]
)
);
if($data["description"] == ""){
$data["description"] = null;
}
// get date + time
// get date from big node
$date =
$this->fuckhtml
->getElementsByClassName(
"date",
"p"
$p
);
$i =
$this->fuckhtml
->getElementsByClassName("i", "p");
if(count($date) !== 0){
// we're inside a big node
$data["date"] = strtotime($date[0]["innerHTML"]);
$data["date"] =
strtotime(
$this->fuckhtml
->getTextContent(
$date[0]
)
);
}
if(count($i) !== 0){
// grep date + author
$s =
$this->fuckhtml
->getElementsByClassName(
"i",
$p
)[0];
$this->fuckhtml->load($i[0]);
$this->fuckhtml->load($s);
$a =
$this->fuckhtml
@ -1148,32 +1117,44 @@ class mojeek{
if(count($a) !== 0){
// parse big node information
$data["author"] =
$this->fuckhtml
->getTextContent($a[0]);
}
}
->getTextContent(
$a[0]["innerHTML"]
);
}else{
// we're inside a small node
if(count($i) !== 0){
$i =
explode(
" - ",
// parse smaller nodes
$replace =
$this->fuckhtml
->getTextContent($i[0])
->getElementsByTagName("time")[0];
$data["date"] =
strtotime(
$this->fuckhtml
->getTextContent(
$replace
)
);
$data["date"] = strtotime(array_pop($i));
$data["author"] = implode(" - ", $i);
}
$s["innerHTML"] =
str_replace(
$replace["outerHTML"],
"",
$s["innerHTML"]
);
$data["author"] =
preg_replace(
'/ &bull; $/',
"",
$s["innerHTML"]
);
}
$out["news"][] = $data;
}
}
}
return $out;
}


@ -6,6 +6,9 @@ class pinterest{
include "lib/nextpage.php";
$this->nextpage = new nextpage("pinterest");
include "lib/proxy_pool.php";
$this->proxy = new proxy_pool("pinterest");
}
public function getfilters($page){
@ -45,6 +48,8 @@ class pinterest{
curl_setopt($curlproc, CURLOPT_CONNECTTIMEOUT, 30);
curl_setopt($curlproc, CURLOPT_TIMEOUT, 30);
$this->proxy->assign_proxy($curlproc);
$data = curl_exec($curlproc);
if(curl_errno($curlproc)){


@ -4,10 +4,8 @@ class sc{
public function __construct(){
include "lib/nextpage.php";
$this->nextpage = new nextpage("sc");
$this->client_id = "ArYppSEotE3YiXCO4Nsgid2LLqJutiww";
$this->user_id = "766585-580597-163310-929698";
include "lib/backend.php";
$this->backend = new backend("sc");
}
public function getfilters($page){
@ -27,7 +25,7 @@ class sc{
];
}
private function get($url, $get = []){
private function get($proxy, $url, $get = []){
$curlproc = curl_init();
@ -40,7 +38,7 @@ class sc{
curl_setopt($curlproc, CURLOPT_ENCODING, ""); // default encoding
curl_setopt($curlproc, CURLOPT_HTTPHEADER,
["User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/116.0",
["User-Agent: " . config::USER_AGENT,
"Accept: application/json, text/javascript, */*; q=0.01",
"Accept-Language: en-US,en;q=0.5",
"Accept-Encoding: gzip",
@ -59,6 +57,8 @@ class sc{
curl_setopt($curlproc, CURLOPT_CONNECTTIMEOUT, 30);
curl_setopt($curlproc, CURLOPT_TIMEOUT, 30);
$this->backend->assign_proxy($curlproc, $proxy);
$data = curl_exec($curlproc);
if(curl_errno($curlproc)){
@ -74,7 +74,7 @@ class sc{
if($get["npt"]){
$params = $this->nextpage->get($get["npt"], "music");
[$params, $proxy] = $this->backend->get($get["npt"], "music");
$params = json_decode($params, true);
$url = $params["url"];
@ -101,7 +101,13 @@ class sc{
// https://api-v2.soundcloud.com/search/playlists_without_albums?q=freddie%20dredd&variant_ids=&facet=genre&user_id=630591-269800-703400-765403&client_id=iMxZgT5mfGstBj8GWJbYMvpzelS8ne0E&limit=20&offset=0&linked_partitioning=1&app_version=1693487844&app_locale=en
$search = $get["s"];
if(strlen($search) === 0){
throw new Exception("Search term is empty!");
}
$type = $get["type"];
$proxy = $this->backend->get_ip();
switch($type){
@ -111,8 +117,8 @@ class sc{
"q" => $search,
"variant_ids" => "",
"facet" => "model",
"user_id" => $this->user_id,
"client_id" => $this->client_id,
"user_id" => config::SC_USER_ID,
"client_id" => config::SC_CLIENT_TOKEN,
"limit" => 20,
"offset" => 0,
"linked_partitioning" => 1,
@ -127,8 +133,8 @@ class sc{
"q" => $search,
"variant_ids" => "",
"facet_genre" => "",
"user_id" => $this->user_id,
"client_id" => $this->client_id,
"user_id" => config::SC_USER_ID,
"client_id" => config::SC_CLIENT_TOKEN,
"limit" => 20,
"offset" => 0,
"linked_partitioning" => 1,
@ -143,8 +149,8 @@ class sc{
"q" => $search,
"variant_ids" => "",
"facet" => "place",
"user_id" => $this->user_id,
"client_id" => $this->client_id,
"user_id" => config::SC_USER_ID,
"client_id" => config::SC_CLIENT_TOKEN,
"limit" => 20,
"offset" => 0,
"linked_partitioning" => 1,
@ -159,8 +165,8 @@ class sc{
"q" => $search,
"variant_ids" => "",
"facet" => "genre",
"user_id" => $this->user_id,
"client_id" => $this->client_id,
"user_id" => config::SC_USER_ID,
"client_id" => config::SC_CLIENT_TOKEN,
"limit" => 20,
"offset" => 0,
"linked_partitioning" => 1,
@ -175,8 +181,8 @@ class sc{
"q" => $search,
"variant_ids" => "",
"facet" => "genre",
"user_id" => $this->user_id,
"client_id" => $this->client_id,
"user_id" => config::SC_USER_ID,
"client_id" => config::SC_CLIENT_TOKEN,
"limit" => 20,
"offset" => 0,
"linked_partitioning" => 1,
@ -192,8 +198,8 @@ class sc{
"variant_ids" => "",
"filter.content_tier" => "SUB_HIGH_TIER",
"facet" => "genre",
"user_id" => $this->user_id,
"client_id" => $this->client_id,
"user_id" => config::SC_USER_ID,
"client_id" => config::SC_CLIENT_TOKEN,
"limit" => 20,
"offset" => 0,
"linked_partitioning" => 1,
@ -206,7 +212,7 @@ class sc{
try{
$json = $this->get($url, $params);
$json = $this->get($proxy, $url, $params);
}catch(Exception $error){
@ -244,9 +250,10 @@ class sc{
$params["url"] = $url; // we will remove this later
$out["npt"] =
$this->nextpage->store(
$this->backend->store(
json_encode($params),
"music"
"music",
$proxy
);
}
@ -342,7 +349,7 @@ class sc{
"endpoint" => "audio_sc",
"url" =>
$item["media"]["transcodings"][0]["url"] .
"?client_id=" . $this->client_id .
"?client_id=" . config::SC_CLIENT_TOKEN .
"&track_authorization=" .
$item["track_authorization"]
];


@ -4,8 +4,8 @@ class wiby{
public function __construct(){
include "lib/nextpage.php";
$this->nextpage = new nextpage("wiby");
include "lib/backend.php";
$this->backend = new backend("wiby");
}
public function getfilters($page){
@ -36,7 +36,7 @@ class wiby{
];
}
private function get($url, $get = [], $nsfw){
private function get($proxy, $url, $get = [], $nsfw){
$curlproc = curl_init();
@ -45,11 +45,13 @@ class wiby{
$url .= "?" . $get;
}
curl_setopt($curlproc, CURLOPT_URL, $url);
curl_setopt($curlproc, CURLOPT_ENCODING, ""); // default encoding
curl_setopt($curlproc, CURLOPT_HTTPHEADER,
["User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:107.0) Gecko/20100101 Firefox/110.0",
["User-Agent: " . config::USER_AGENT,
"Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8",
"Accept-Language: en-US,en;q=0.5",
"Accept-Encoding: gzip",
@ -69,6 +71,8 @@ class wiby{
curl_setopt($curlproc, CURLOPT_CONNECTTIMEOUT, 30);
curl_setopt($curlproc, CURLOPT_TIMEOUT, 30);
$this->backend->assign_proxy($curlproc, $proxy);
$data = curl_exec($curlproc);
if(curl_errno($curlproc)){
@ -84,11 +88,8 @@ class wiby{
if($get["npt"]){
$q =
json_decode(
$this->nextpage->get($get["npt"], "web"),
true
);
[$q, $proxy] = $this->backend->get($get["npt"], "web");
$q = json_decode($q, true);
$nsfw = $q["nsfw"];
unset($q["nsfw"]);
@ -100,6 +101,7 @@ class wiby{
throw new Exception("Search term is empty!");
}
$proxy = $this->backend->get_ip();
$date = $get["date"];
$nsfw = $get["nsfw"] == "yes" ? "0" : "1";
@ -150,6 +152,7 @@ class wiby{
try{
$html = $this->get(
$proxy,
"https://wiby.me/",
$q,
$nsfw
@ -171,13 +174,14 @@ class wiby{
}else{
$nextpage =
$this->nextpage->store(
$this->backend->store(
json_encode([
"q" => $q["q"],
"p" => (int)$nextpage[1],
"nsfw" => $nsfw
]),
"web"
"web",
$proxy
);
}


@ -10,11 +10,11 @@ class yandex{
include "lib/fuckhtml.php";
$this->fuckhtml = new fuckhtml();
include "lib/nextpage.php";
$this->nextpage = new nextpage("yandex");
include "lib/backend.php";
// backend included in the scraper functions
}
private function get($url, $get = [], $nsfw){
private function get($proxy, $url, $get = [], $nsfw){
$curlproc = curl_init();
@ -32,7 +32,7 @@ class yandex{
}
$headers =
["User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/113.0",
["User-Agent: " . config::USER_AGENT,
"Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8",
"Accept-Encoding: gzip",
"Accept-Language: en-US,en;q=0.5",
@ -55,6 +55,8 @@ class yandex{
curl_setopt($curlproc, CURLOPT_CONNECTTIMEOUT, 30);
curl_setopt($curlproc, CURLOPT_TIMEOUT, 30);
$this->backend->assign_proxy($curlproc, $proxy);
$data = curl_exec($curlproc);
if(curl_errno($curlproc)){
@ -207,6 +209,8 @@ class yandex{
public function web($get){
$this->backend = new backend("yandex_w");
// has captcha
// https://yandex.com/search/touch/?text=lol&app_platform=android&appsearch_header=1&ui=webmobileapp.yandex&app_version=23070603&app_id=ru.yandex.searchplugin&search_source=yandexcom_touch_native&clid=2218567
@ -215,10 +219,11 @@ class yandex{
if($get["npt"]){
$npt = $this->nextpage->get($get["npt"], "web");
[$npt, $proxy] = $this->backend->get($get["npt"], "web");
$html =
$this->get(
$proxy,
"https://yandex.com" . $npt,
[],
"yes"
@ -226,6 +231,12 @@ class yandex{
}else{
$search = $get["s"];
if(strlen($search) === 0){
throw new Exception("Search term is empty!");
}
$proxy = $this->backend->get_ip();
$lang = $get["lang"];
$older = $get["older"];
$newer = $get["newer"];
@ -269,6 +280,7 @@ class yandex{
try{
$html =
$this->get(
$proxy,
"https://yandex.com/search/site/",
$params,
"yes"
@ -313,7 +325,7 @@ class yandex{
if(count($npt) !== 0){
$out["npt"] =
$this->nextpage->store(
$this->backend->store(
$this->fuckhtml
->getTextContent(
$npt
@ -321,7 +333,8 @@ class yandex{
["attributes"]
["href"]
),
"web"
"web",
$proxy
);
}
@ -386,17 +399,18 @@ class yandex{
public function image($get){
$this->backend = new backend("yandex_i");
if($get["npt"]){
$request =
json_decode(
$this->nextpage->get(
[$request, $proxy] =
$this->backend->get(
$get["npt"],
"images"
),
true
);
$request = json_decode($request, true);
$nsfw = $request["nsfw"];
unset($request["nsfw"]);
}else{
@ -407,6 +421,7 @@ class yandex{
throw new Exception("Search term is empty!");
}
$proxy = $this->backend->get_ip();
$nsfw = $get["nsfw"];
$time = $get["time"];
$size = $get["size"];
@ -611,9 +626,11 @@ class yandex{
try{
$json = $this->get(
$proxy,
"https://yandex.com/images/search",
$request,
$nsfw
$nsfw,
"yandex_i"
);
}catch(Exception $err){
@ -676,7 +693,12 @@ class yandex{
$request["p"] = 1;
}
$out["npt"] = $this->nextpage->store(json_encode($request), "images");
$out["npt"] =
$this->backend->store(
json_encode($request),
"images",
$proxy
);
}
// get search results
@ -744,21 +766,29 @@ class yandex{
public function video($get){
$this->backend = new backend("yandex_v");
if($get["npt"]){
$params =
json_decode(
$this->nextpage->get(
[$params, $proxy] =
$this->backend->get(
$get["npt"],
"web"
),
true
"video"
);
$params = json_decode($params, true);
$nsfw = $params["nsfw"];
unset($params["nsfw"]);
}else{
$search = $get["s"];
if(strlen($search) === 0){
throw new Exception("Search term is empty!");
}
$proxy = $this->backend->get_ip();
$nsfw = $get["nsfw"];
$time = $get["time"];
$duration = $get["duration"];
@ -865,9 +895,11 @@ class yandex{
try{
$json =
$this->get(
$proxy,
"https://yandex.com/video/search",
$params,
$nsfw
$nsfw,
"yandex_v"
);
}catch(Exception $error){
@ -926,9 +958,10 @@ class yandex{
$params["p"] = "1";
$params["nsfw"] = $nsfw;
$out["npt"] =
$this->nextpage->store(
$this->backend->store(
json_encode($params),
"web"
"video",
$proxy
);
}


@ -4,8 +4,8 @@ class yep{
public function __construct(){
include "lib/nextpage.php";
$this->nextpage = new nextpage("yep");
include "lib/backend.php";
$this->backend = new backend("yep");
}
public function getfilters($page){
@ -238,7 +238,7 @@ class yep{
];
}
private function get($url, $get = []){
private function get($proxy, $url, $get = []){
$curlproc = curl_init();
@ -251,7 +251,7 @@ class yep{
curl_setopt($curlproc, CURLOPT_ENCODING, ""); // default encoding
curl_setopt($curlproc, CURLOPT_HTTPHEADER,
["User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:107.0) Gecko/20100101 Firefox/110.0",
["User-Agent: " . config::USER_AGENT,
"Accept: */*",
"Accept-Language: en-US,en;q=0.5",
"Accept-Encoding: gzip",
@ -270,6 +270,8 @@ class yep{
curl_setopt($curlproc, CURLOPT_CONNECTTIMEOUT, 30);
curl_setopt($curlproc, CURLOPT_TIMEOUT, 30);
$this->backend->assign_proxy($curlproc, $proxy);
$data = curl_exec($curlproc);
if(curl_errno($curlproc)){
@ -284,6 +286,11 @@ class yep{
public function image($get){
$search = $get["s"];
if(strlen($search) === 0){
throw new Exception("Search term is empty!");
}
$country = $get["country"];
$nsfw = $get["nsfw"];
@ -305,6 +312,7 @@ class yep{
$json =
json_decode(
$this->get(
$this->backend->get_ip(), // no nextpage!
"https://api.yep.com/fs/2/search",
[
"client" => "web",


@ -8,8 +8,8 @@ class youtube{
public function __construct(){
include "lib/nextpage.php";
$this->nextpage = new nextpage("yt");
include "lib/backend.php";
$this->backend = new backend("yt");
}
public function getfilters($page){
@ -340,7 +340,7 @@ class youtube{
const req_web = 0;
const req_xhr = 1;
private function get($url, $get = [], $reqtype = self::req_web, $continuation = null){
private function get($proxy, $url, $get = [], $reqtype = self::req_web, $continuation = null){
$curlproc = curl_init();
@ -354,7 +354,7 @@ class youtube{
switch($reqtype){
case self::req_web:
$headers =
["User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:107.0) Gecko/20100101 Firefox/110.0",
["User-Agent: " . config::USER_AGENT,
"Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8",
"Accept-Language: en-US,en;q=0.5",
"Accept-Encoding: gzip",
@ -370,7 +370,7 @@ class youtube{
case self::req_xhr:
$headers =
["User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:110.0) Gecko/20100101 Firefox/110.0",
["User-Agent: " . config::USER_AGENT,
"Accept: */*",
"Accept-Language: en-US,en;q=0.5",
"Accept-Encoding: gzip",
@ -398,6 +398,8 @@ class youtube{
curl_setopt($curlproc, CURLOPT_CONNECTTIMEOUT, 30);
curl_setopt($curlproc, CURLOPT_TIMEOUT, 30);
$this->backend->assign_proxy($curlproc, $proxy);
$data = curl_exec($curlproc);
if(curl_errno($curlproc)){
@ -430,17 +432,17 @@ class youtube{
$json = fread($handle, filesize("nextpage.json"));
fclose($handle);*/
$npt =
json_decode(
$this->nextpage->get(
[$npt, $proxy] =
$this->backend->get(
$get["npt"],
"videos"
),
true
);
$npt = json_decode($npt, true);
try{
$json = $this->get(
$proxy,
"https://www.youtube.com/youtubei/v1/search",
[
"key" => $npt["key"],
@ -507,6 +509,7 @@ class youtube{
throw new Exception("Search term is empty!");
}
$proxy = $this->backend->get_ip();
$date = $get["date"];
$type = $get["type"];
$duration = $get["duration"];
@ -537,6 +540,7 @@ class youtube{
try{
$json = $this->get(
$proxy,
"https://www.youtube.com/results",
$get
);
@ -942,7 +946,14 @@ class youtube{
if($this->out["npt"] !== null){
$this->out["npt"] = $this->nextpage->store(json_encode($this->out["npt"]), "videos");
$this->out["npt"] =
$this->backend->store(
json_encode(
$this->out["npt"]
),
"videos",
$proxy
);
}
return $this->out;


@ -1,5 +1,7 @@
<?php
include "data/config.php";
/*
Define settings
*/
@ -28,16 +30,7 @@ $settings = [
[
"description" => "Theme",
"parameter" => "theme",
"options" => [
[
"value" => "dark",
"text" => "Gruvbox dark"
],
[
"value" => "cream",
"text" => "Gruvbox cream"
]
]
"options" => []
],
[
"description" => "Prevent clicking background elements when image viewer is open",
@ -59,7 +52,7 @@ $settings = [
"name" => "Scrapers to use",
"settings" => [
[
"description" => "Autocomplete<br><i>Picking <div class=\"code-inline\">Auto</div> changes the source dynamically depending of the page's scraper<br>Picking <div class=\"code-inline\">Disabled</div> disables this feature</i>",
"description" => "Autocomplete<br><i>Picking <span class=\"code-inline\">Auto</span> changes the source dynamically depending of the page's scraper<br><b>Warning:</b> If you edit this field, you will need to re-add the search engine so that the new autocomplete settings are applied!</i>",
"parameter" => "scraper_ac",
"options" => [
[
@ -242,6 +235,26 @@ $settings = [
]
];
/*
Set theme collection
*/
$themes = glob("static/themes/*");
$settings[0]["settings"][1]["options"][] = [
"value" => "Dark",
"text" => "Dark"
];
foreach($themes as $theme){
$theme = explode(".", basename($theme))[0];
$settings[0]["settings"][1]["options"][] = [
"value" => $theme,
"text" => $theme
];
}
/*
Set cookies
*/
@ -262,6 +275,25 @@ if($_POST){
foreach($loop as $key => $value){
if($key == "theme"){
if($value == config::DEFAULT_THEME){
unset($_COOKIE[$key]);
setcookie(
"theme",
"",
[
"expires" => -1, // removes cookie
"samesite" => "Lax",
"path" => "/"
]
);
continue;
}
}else{
foreach($settings as $title){
foreach($title["settings"] as $list){
@ -287,6 +319,7 @@ foreach($loop as $key => $value){
}
}
}
}
if(!is_string($value)){
@ -313,19 +346,13 @@ include "lib/frontend.php";
$frontend = new frontend();
echo
'<!DOCTYPE html>' .
'<html lang="en">' .
'<head>' .
'<meta http-equiv="Content-Type" content="text/html;charset=utf-8">' .
'<title>Settings</title>' .
'<link rel="stylesheet" href="/static/style.css?v4">' .
'<meta name="viewport" content="width=device-width,initial-scale=1">' .
'<meta name="robots" content="index,follow">' .
'<link rel="icon" type="image/x-icon" href="/favicon.ico">' .
'<meta name="description" content="4get.ca: Settings">' .
'<link rel="search" type="application/opensearchdescription+xml" title="4get" href="/opensearch.xml">' .
'</head>' .
'<body' . $frontend->getthemeclass() . '>';
$frontend->load(
"header_nofilters.html",
[
"title" => "Settings",
"class" => ""
]
);
$left =
'<h1>Settings</h1>' .
@ -376,6 +403,14 @@ foreach($settings as $title){
'<div class="title">' . $setting["description"] . '</div>' .
'<select name="' . $setting["parameter"] . '">';
if($setting["parameter"] == "theme"){
if(!isset($_COOKIE["theme"])){
$_COOKIE["theme"] = config::DEFAULT_THEME;
}
}
foreach($setting["options"] as $option){
$left .=

35
sitemap.php Normal file

@ -0,0 +1,35 @@
<?php
header("Content-Type: application/xml");
include "data/config.php";
$domain =
htmlspecialchars(
((!empty($_SERVER["HTTPS"]) && $_SERVER["HTTPS"] !== "off") ? 'https' : 'http') . // SERVER_PROTOCOL is e.g. "HTTP/1.1" and never contains "https"
'://' . $_SERVER["HTTP_HOST"]
);
echo
'<?xml version="1.0" encoding="UTF-8"?>' .
'<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' .
'<url>' .
'<loc>' . $domain . '/</loc>' .
'<lastmod>2023-07-31T07:56:12+03:00</lastmod>' .
'</url>' .
'<url>' .
'<loc>' . $domain . '/about</loc>' .
'<lastmod>2023-07-31T07:56:12+03:00</lastmod>' .
'</url>' .
'<url>' .
'<loc>' . $domain . '/instances</loc>' .
'<lastmod>2023-07-31T07:56:12+03:00</lastmod>' .
'</url>' .
'<url>' .
'<loc>' . $domain . '/settings</loc>' .
'<lastmod>2023-07-31T07:56:12+03:00</lastmod>' .
'</url>' .
'<url>' .
'<loc>' . $domain . '/api.txt</loc>' .
'<lastmod>2023-07-31T07:56:12+03:00</lastmod>' .
'</url>' .
'</urlset>';


@ -318,11 +318,23 @@ if(image_class !== null){
image_url = htmlspecialchars_decode(image_url);
}
var w = Math.round(click.target.naturalWidth);
var h = Math.round(click.target.naturalHeight);
if(
w === 0 ||
h === 0
){
w = 100;
h = 100;
}
collection = [
{
"url": image_url,
"width": Math.round(click.target.naturalWidth),
"height": Math.round(click.target.naturalHeight)
"width": w,
"height": h
}
];
@ -362,10 +374,22 @@ if(image_class !== null){
var imagesize = elem.getElementsByTagName("img")[0];
var imagesize_w = 0;
var imagesize_h = 0;
if(imagesize.complete){
var imagesize_w = imagesize.naturalWidth;
var imagesize_h = imagesize.naturalHeight;
imagesize_w = imagesize.naturalWidth;
imagesize_h = imagesize.naturalHeight;
}
if(
imagesize_w === 0 ||
imagesize_h === 0
){
imagesize_w = 100;
imagesize_h = 100;
}
for(var i=0; i<collection.length; i++){

495
static/serverping.js Normal file

@ -0,0 +1,495 @@
function htmlspecialchars(str){
if(str === null){
return "<i>&lt;Empty&gt;</i>";
}
var map = {
'&': '&amp;',
'<': '&lt;',
'>': '&gt;',
'"': '&quot;',
"'": '&#039;'
}
return str.replace(/[&<>"']/g, function(m){return map[m];});
}
// initialize garbage
var list = [];
var pinged_list = [];
var reqs = 0;
var errors = 0;
var sort = 0; // lower ping first
// check for instance redirect stuff
var redir = "";
var target = "/web?";
new URL(window.location.href)
.searchParams
.forEach(
function(value, key){
if(key == "target"){
target = "/" + encodeURIComponent(value) + "?";
return;
}
if(key == "npt"){ return; }
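// separate parameters with "&" so multiple keys form a valid query string
if(redir != ""){ redir += "&"; }
redir += encodeURIComponent(key) + "=" + encodeURIComponent(value);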
}
);
if(redir != ""){
redir = target + redir;
}
var quote = document.createElement("div");
quote.className = "quote";
quote.innerHTML = 'Pinged <b>0</b> servers (<b>0</b> failed requests)';
var [div_servercount, div_failedreqs] =
quote.getElementsByTagName("b");
var noscript = document.getElementsByTagName("noscript")[0];
document.body.insertBefore(quote, noscript.nextSibling);
// create table
var table = document.createElement("table");
table.innerHTML =
'<thead>' +
'<tr>' +
'<th><div class="arrow up"></div>Ping</th>' +
'<th class="extend">Server</th>' +
'<th>Address</th>' +
'<th>Bot protection</th>' +
'<th title="Amount of legit requests processed since the last APCU cache clear (usually happens at midnight)">Real reqs (?)</th>' +
'<th title="Amount of filtered requests processed since the last APCU cache clear (usually happens at midnight)">Bot reqs (?)</th>' +
'<th>API</th>' +
'<th>Version</th>' +
'</tr>' +
'</thead>' +
'<tbody></tbody>';
document.body.insertBefore(table, quote.nextSibling);
// handle sorting clicks
var tbody = table.getElementsByTagName("tbody")[0];
var th = table.getElementsByTagName("th");
for(var i=0; i<th.length; i++){
th[i].addEventListener("click", function(event){
if(event.target.className.includes("arrow")){
var div = event.target.parentElement;
}else{
var div = event.target;
}
var arrow = div.getElementsByClassName("arrow");
var orientation = 0; // up
if(arrow.length === 0){
// delete arrow and add new one
arrow = document.getElementsByClassName("arrow");
arrow[0].remove();
arrow = document.createElement("div");
arrow.className = "arrow up";
div.insertBefore(arrow, event.target.firstChild);
}else{
// switch arrow position
if(arrow[0].className == "arrow down"){
arrow[0].className = "arrow up";
}else{
arrow[0].className = "arrow down";
orientation = 1;
}
}
switch(div.textContent.toLowerCase()){
case "ping": sort = orientation; break;
case "server": sort = 2 + orientation; break;
case "address": sort = 4 + orientation; break;
case "bot protection": sort = 6 + orientation; break;
case "real reqs (?)": sort = 8 + orientation; break;
case "bot reqs (?)": sort = 10 + orientation; break;
case "api": sort = 12 + orientation; break;
case "version": sort = 14 + orientation; break;
}
render_list();
});
}
function validate_url(url, allow_http = false){
try{
url = new URL(url);
if(
url.protocol == "https:" ||
(
(
allow_http === true ||
window.location.protocol == "http:"
) &&
url.protocol == "http:"
)
){
return true;
}
}catch(error){} // do nothing
return false;
}
function number_format(int){
return new Intl.NumberFormat().format(int);
}
// parse initial server list
fetch_server(window.location.origin);
async function fetch_server(server){
if(!validate_url(server)){
console.warn("Invalid server URL: " + server);
return;
}
// make sure baseURL is origin
server = new URL(server).origin;
// prevent multiple fetches
for(var i=0; i<list.length; i++){
if(list[i] == server){
// serber was already fetched
console.info("Already checked server: " + server);
return;
}
}
// prevent future fetches
list.push(server);
var data = null;
var ping = new Date().getTime();
try{
data = await fetch(
server + "/ami4get"
);
if(data.status !== 200){
// endpoint is not available
errors++;
div_failedreqs.textContent = number_format(errors);
console.warn(server + ": Invalid HTTP code " + data.status);
return;
}
data = await data.json();
data.server.ping = new Date().getTime() - ping;
}catch(error){
errors++;
div_failedreqs.textContent = number_format(errors);
console.warn(server + ": Could not fetch or decode JSON");
return;
}
// sanitize data
if(
typeof data.status != "string" ||
data.status != "ok" ||
typeof data.server != "object" ||
!(
typeof data.server.name == "string" ||
(
typeof data.server.name == "object" &&
data.server.name === null
)
) ||
typeof data.service != "string" ||
data.service != "4get" ||
(
typeof data.server.description != "string" &&
data.server.description !== null
) ||
typeof data.server.bot_protection != "number" ||
typeof data.server.real_requests != "number" ||
typeof data.server.bot_requests != "number" ||
typeof data.server.api_enabled != "boolean" ||
!Array.isArray(data.server.alt_addresses) || // typeof null is also "object"
typeof data.server.version != "number" ||
!Array.isArray(data.instances)
){
errors++;
div_failedreqs.textContent = number_format(errors);
console.warn(server + ": Malformed JSON");
return;
}
data.server.ip = server;
reqs++;
div_servercount.textContent = number_format(reqs);
var total = pinged_list.push(data) - 1;
pinged_list[total].index = total;
render_list();
// get more serbers
for(var i=0; i<data.instances.length; i++){
fetch_server(data.instances[i]);
}
}
function sorta(object, element, order){
return object.slice().sort(
function(a, b){
if(order){
return a.server[element] - b.server[element];
}
return b.server[element] - a.server[element];
}
);
}
function textsort(object, element, order){
var sort = object.slice().sort(
function(a, b){
// server names may be null, fall back to an empty string
return (a.server[element] || "").localeCompare(b.server[element] || "");
}
);
if(!order){
return sort.reverse();
}
return sort;
}
function render_list(){
var sorted_list = [];
// sort
var filter = Boolean(sort % 2);
switch(sort){
case 0:
case 1:
sorted_list = sorta(pinged_list, "ping", filter === true ? false : true);
break;
case 2:
case 3:
sorted_list = textsort(pinged_list, "name", filter === true ? false : true);
break;
case 4:
case 5:
sorted_list = textsort(pinged_list, "ip", filter === true ? false : true);
break;
case 6:
case 7:
sorted_list = sorta(pinged_list, "bot_protection", filter === true ? false : true);
break;
case 8:
case 9:
sorted_list = sorta(pinged_list, "real_requests", filter);
break;
case 10:
case 11:
sorted_list = sorta(pinged_list, "bot_requests", filter);
break;
case 12:
case 13:
sorted_list = sorta(pinged_list, "api_enabled", filter);
break;
case 14:
case 15:
sorted_list = sorta(pinged_list, "version", filter);
break;
}
// render tabloid
var html = "";
for(var k=0; k<sorted_list.length; k++){
html += '<tr onclick="show_server(' + sorted_list[k].index + ');">';
for(var i=0; i<8; i++){
html += '<td';
switch(i){
case 0: // server ping
if(sorted_list[k].server.ping <= 100){
html += '><span style="color:var(--green);">' + sorted_list[k].server.ping + '</span>';
break;
}
if(sorted_list[k].server.ping <= 200){
html += '><span style="color:var(--yellow);">' + sorted_list[k].server.ping + '</span>';
break;
}
html += '><span style="color:var(--red);">' + number_format(sorted_list[k].server.ping) + '</span>';
break;
// server name
case 1: html += ' class="extend">' + htmlspecialchars(sorted_list[k].server.name); break;
case 2: html += '>' + htmlspecialchars(new URL(sorted_list[k].server.ip).host); break;
case 3: // bot protection
switch(sorted_list[k].server.bot_protection){
case 0:
html += '><span style="color:var(--green);">Disabled</span>';
break;
case 1:
html += '><span style="color:var(--yellow);">Image captcha</span>';
break;
case 2:
html += '><span style="color:var(--red);">Invite only</span>';
break;
default:
html += '>Unknown';
}
break;
case 4: // real reqs
html += '>' + number_format(sorted_list[k].server.real_requests);
break;
case 5: // bot reqs
html += '>' + number_format(sorted_list[k].server.bot_requests);
break;
case 6: // api enabled
if(sorted_list[k].server.api_enabled){
html += '><span style="color:var(--green);">Yes</span>';
}else{
html += '><span style="color:var(--red);">No</span>';
}
break;
// version
case 7: html += ">v" + sorted_list[k].server.version; break;
}
html += '</td>';
}
html += '</tr>';
}
tbody.innerHTML = html;
}
var popup_bg = document.getElementById("popup-bg");
var popup_wrapper = document.getElementsByClassName("popup-wrapper")[0];
var popup = popup_wrapper.getElementsByClassName("popup")[0];
var popup_shown = false;
popup_bg.addEventListener("click", function(){
popup_wrapper.style.display = "none";
popup_bg.style.display = "none";
});
function show_server(serverid){
var html =
'<h2>' + htmlspecialchars(pinged_list[serverid].server.name) + '</h2>' +
'Description' +
'<div class="code">' + htmlspecialchars(pinged_list[serverid].server.description) + '</div>';
var url_obj = new URL(pinged_list[serverid].server.ip);
var url = htmlspecialchars(url_obj.origin);
var domain = url_obj.hostname;
html +=
'URL: <a rel="noreferrer" target="_BLANK" href="' + url + redir + '">' + url + '</a> <a rel="noreferrer" target="_BLANK" href="https://browserleaks.com/ip/' + encodeURIComponent(domain) + '">(IP lookup)</a>' +
'<br><br>Alt addresses:';
var len = pinged_list[serverid].server.alt_addresses.length;
if(len === 0){
html += ' <i>&lt;Empty&gt;</i>';
}else{
html += '<ul>';
for(var i=0; i<len; i++){
// validate before parsing: new URL() throws on malformed addresses
if(!validate_url(pinged_list[serverid].server.alt_addresses[i], true)){
console.warn(pinged_list[serverid].server.ip + ": Invalid peer URL => " + pinged_list[serverid].server.alt_addresses[i]);
continue;
}
var url_obj = new URL(pinged_list[serverid].server.alt_addresses[i]);
var url = htmlspecialchars(url_obj.origin);
var domain = url_obj.hostname;
html += '<li><a rel="noreferrer" href="' + url + redir + '" target="_BLANK">' + url + '</a> <a rel="noreferrer" target="_BLANK" href="https://browserleaks.com/ip/' + encodeURIComponent(domain) + '">(IP lookup)</a></li>';
}
html += '</ul>';
}
popup.innerHTML = html;
popup_wrapper.style.display = "block";
popup_bg.style.display = "block";
}
function hide_server(){
popup_wrapper.style.display = "none";
popup_bg.style.display = "none";
}


@ -1,7 +1,3 @@
/*
Global styles
*/
:root{
/* background */
--1d2021: #1d2021;
@ -21,31 +17,11 @@
--default: #d4be98;
--keyword: #d8a657;
--string: #7daea7;
}
.theme-white{
/* background */
--1d2021: #bdae93;
--282828: #a89984;
--3c3836: #a89984;
--504945: #504945;
/* font */
--928374: #1d2021;
--a89984: #282828;
--bdae93: #3c3836;
--8ec07c: #52520e;
--ebdbb2: #1d2021;
/* code highlighter */
--comment: #6a4400;
--default: #d4be98;
--keyword: #4a4706;
--string: #076678;
}
.theme-white .autocomplete .entry:hover{
background:#928374;
/* color codes for instance list */
--green: #b8bb26;
--yellow: #d8a657;
--red: #fb4934;
}
audio{
@ -516,6 +492,7 @@ h3,h4,h5,h6{
.web .favicon img,
.favicon-dropdown img{
margin:3px 7px 0 0;
width:16px;
height:16px;
font-size:12px;
line-height:16px;
@ -1020,6 +997,7 @@ table tr a:last-child{
cursor:grab;
user-select:none;
pointer-events:none;
z-index:5;
}
#popup:active{
@ -1046,6 +1024,7 @@ table tr a:last-child{
height:35px;
background:var(--1d2021);
border-bottom:1px solid var(--928374);
z-index:4;
}
#popup-bg{
@ -1057,6 +1036,7 @@ table tr a:last-child{
width:100%;
height:100%;
display:none;
z-index:3;
}
#popup-status select{
@ -1166,6 +1146,108 @@ table tr a:last-child{
color:var(--string);
}
/*
Instances page
*/
.instances table{
white-space:nowrap;
margin-top:17px;
}
.instances a{
color:var(--bdae93);
}
.instances tbody tr:nth-child(even){
background:var(--282828);
}
.instances thead{
outline:1px solid var(--928374);
outline-offset:-1px;
background:var(--3c3836);
user-select:none;
z-index:2;
position:sticky;
top:0;
}
.instances th{
cursor:row-resize;
}
.instances th:hover{
background:var(--504945);
}
.instances tbody{
outline:1px solid var(--504945);
outline-offset:-1px;
position:relative;
top:-1px;
}
.instances tbody tr:hover{
background:var(--3c3836);
cursor:pointer;
}
.instances .arrow{
display:inline-block;
position:relative;
top:6px;
margin-right:7px;
width:0;
height:0;
border:6px solid transparent;
border-top:10px solid var(--bdae93);
}
.instances .arrow.up{
top:0;
border:6px solid transparent;
border-bottom:10px solid var(--bdae93);
}
.instances th, .instances td{
padding:4px 7px;
width:0;
}
.instances .extend{
width:unset;
overflow:hidden;
max-width:200px;
}
.instances .popup-wrapper{
display:none;
position:fixed;
left:50%;
top:50%;
transform:translate(-50%, -50%);
width:800px;
max-width:100%;
max-height:100%;
overflow-x:auto;
padding:17px;
box-sizing:border-box;
pointer-events:none;
z-index:3;
}
.instances .popup{
border:1px solid var(--928374);
background:var(--282828);
padding:7px 10px;
pointer-events:initial;
}
.instances ul{
padding-left:20px;
}
/*
Responsive image
*/
@ -1221,7 +1303,7 @@ table tr a:last-child{
width:100%;
}
table td{
body:not(.instances) table td{
display:block;
width:100%;
}

31
static/themes/Cream.css Normal file

@ -0,0 +1,31 @@
:root{
/* background */
--1d2021: #bdae93;
--282828: #a89984;
--3c3836: #a89984;
--504945: #504945;
/* font */
--928374: #1d2021;
--a89984: #282828;
--bdae93: #3c3836;
--8ec07c: #52520e;
--ebdbb2: #1d2021;
/* code highlighter */
--comment: #6a4400;
--default: #d4be98;
--keyword: #4a4706;
--string: #076678;
/* color codes for instance list */
--green: #636311;
--yellow: #8a6214;
--red: #711410;
}
.autocomplete .entry:hover,
.instances th:hover
{
background:#928374;
}

77
template/about.html Normal file

@ -0,0 +1,77 @@
<a href="/" class="link">&lt; Go back</a>
<h1>Set as default search engine</h1>
<a href="#firefox"><h2 id="firefox">On Firefox and other Gecko based browsers</h2></a>
To set this as your default search engine on Firefox, right click the URL bar and select <div class="code-inline">Add "4get"</div>. Then, visit <a href="about:preferences#search" target="_BLANK" class="link">about:preferences#search</a> and select <div class="code-inline">4get</div> in the dropdown menu.
<a href="#chrome"><h2 id="chrome">On Chromium and Blink based browsers</h2></a>
Click the 3 stacked dots at the top right of the screen and click on <div class="code-inline">Settings</div>, then search for <div class="code-inline">default search engine</div>, or visit <a href="chrome://settings/searchEngines">chrome://settings/searchEngines</a>.<br><br>
Once you're there, click the pencil on the last entry under "Search engines" (it's probably DuckDuckGo). Once you do that, a popup will appear. Populate it with the following information:
<table>
<tr>
<td><b>Field</b></td>
<td><b>Value</b></td>
</tr>
<tr>
<td>Search engine</td>
<td>{%server_name%}</td>
</tr>
<tr>
<td>Shortcut</td>
<td>{%server_name%}</td>
</tr>
<tr>
<td>URL with %s in place of query</td>
<td>https://4get.ca/web?s=%s</td>
</tr>
</table>
Once that's done, click <div class="code-inline">Save</div>. Then, on the right-hand side of the newly created entry, open the dropdown menu and select <div class="code-inline">Make default</div>.
<h1>Frequently asked questions</h1>
<a href="#what-is-this"><h2 id="what-is-this">What is this?</h2></a>
This is a metasearch engine that gets results from other engines, and strips away all of the tracking parameters and Microsoft/globohomo bullshit they add. Most of the other alternatives to Google jack themselves off about being ""privacy respecting"" or whatever the fuck but it always turns out to be a total lie, and I just got fed up with their shit honestly. Alternatives like Searx or YaCy all fucking sucks so I made my own thing.
<a href="#goal"><h2 id="goal">My goal</h2></a>
Provide users with a privacy oriented, extremely lightweight, ad free, free as in freedom (and free beer!) way to search for documents around the internet, with minimal, optional javascript code. My long term goal would be to build my own index (that doesn't suck) and provide users with an unbiased search engine, with no political inclinations.
<a href="#logs"><h2 id="logs">Do you keep logs?</h2></a>
I store data temporarily to get the next page of results. This might include search queries, tokens and other parameters. These parameters are encrypted using <div class="code-inline">aes-256-gcm</div> on the serber, for which I give you a key (also known internally as <div class="code-inline">npt</div> token). When you make a request to get the next page, you supply the token, the data is decrypted and the request is fulfilled. This encrypted data is deleted after 15 minutes, or after it's used, whichever comes first.<br><br>
I <b>don't</b> log IP addresses, user agents, or anything else. The <div class="code-inline">npt</div> tokens are the only things that are stored (in RAM, mind you), temporarily, encrypted.
<a href="#information-sharing"><h2 id="information-sharing">Do you share information with third parties?</h2></a>
Your search queries and supplied filters are shared with the scraper you chose (so I can get the search results, duh). I don't share anything else (that means I don't share your IP address, location, or anything of this kind). There is no way that site can know you're the one searching for something, <u>unless you send out a search query that de-anonymises you.</u> For example, a search query like "hello my full legal name is jonathan gallindo and i want pictures of cloacas" would definitively blow your cover. 4get doesn't contain ads or any third party javascript applets or trackers. I don't profile you, and quite frankly, I don't give a shit about what you search on there.<br><br>
TL;DR assume those websites can see what you search for, but can't see who you are (unless you're really dumb).
<a href="#hosting"><h2 id="hosting">Where is this website hosted?</h2></a>
This website is hosted on a Contabo shitbox in the United States.
<a href="#keyboard-shortcuts"><h2 id="keyboard-shortcuts">Keyboard shortcuts?</h2></a>
Use <div class="code-inline">/</div> to focus the search box.<br><br>
When the image viewer is open, you can use the following keybinds:<br>
<div class="code-inline">Up</div>, <div class="code-inline">Down</div>, <div class="code-inline">Left</div>, <div class="code-inline">Right</div> to rotate the image.<br>
<div class="code-inline">CTRL+Up</div>, <div class="code-inline">CTRL+Down</div>, <div class="code-inline">CTRL+Left</div>, <div class="code-inline">CTRL+Right</div> to mirror the image.<br>
<div class="code-inline">Escape</div> to exit the image viewer.
<a href="#schizo"><h2 id="schizo">How can I trust you?</h2></a>
You just sort of have to take my word for it right now. If you'd rather trust yourself instead of me (I believe in you!!), all of the code on this website is available through my <a href="https://git.lolcat.ca/lolcat" class="link">git page</a> for you to host on your own machines. Just a reminder: if you're the sole user of your instance, it doesn't take immense brain power for Microshit to figure out you basically just switched IP addresses. Invite your friends to use your instance!
<a href="#donate"><h2 id="donate">Support the project</h2></a>
Donate to me through ko-fi: <a href="https://ko-fi.com/lolcat" target="BLANK" rel="noreferrer">ko-fi.com/lolcat</a><br>
Please donate: I sent myself a donation to test whether it works and it looks fucking dumb. Reasons to donate are listed there. Thank you!
<a href="#contact"><h2 id="contact">I want to report abuse or have erotic roleplay through email</h2></a>
I don't know about that second part but if you want to talk to me, just drop me an email...<br><br>
<b>Message to all DMCA enforcers:</b> I don't host any of the content. Everything you see here is <u>proxied</u> through my shitbox with no moderation. Please reach out to the people hosting the infringing content instead.<br><br>
<a href="https://lolcat.ca" rel="dofollow" class="link">Click here to contact me!</a><br><br>
<a href="https://validator.w3.org/nu/?doc=https%3A%2F%2F4get.ca" title="W3 Valid!">
<img src="/static/icon/w3html.png" alt="Valid W3C HTML 4.01" width="88" height="31">
</a>


@ -3,14 +3,15 @@
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8">
<title>{%title%}</title>
<link rel="stylesheet" href="/static/style.css?v4">
<link title="{%server_name%}" href="/opensearch{%ac%}" rel="search" type="application/opensearchdescription+xml">
<link rel="stylesheet" href="/static/style.css?v{%version%}">
{%style%}
<meta name="viewport" content="width=device-width,initial-scale=1">
<meta name="robots" content="{%index%}index,{%index%}follow">
<link rel="icon" type="image/x-icon" href="/favicon.ico">
<meta name="description" content="4get.ca: {%description%}">
<link rel="search" type="application/opensearchdescription+xml" title="4get" href="/opensearch.xml">
<meta name="description" content="{%server_name%}: {%description%}">
</head>
<body{%body_class%}>
<body>
<form method="GET" autocomplete="off">
<div class="searchbox">
<input type="submit" value="Search" tabindex="-1">


@ -0,0 +1,14 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8">
<title>{%title%}</title>
<link title="{%server_name%}" href="/opensearch{%ac%}" rel="search" type="application/opensearchdescription+xml">
<link rel="stylesheet" href="/static/style.css?v{%version%}">
{%style%}
<meta name="viewport" content="width=device-width,initial-scale=1">
<meta name="robots" content="index,follow">
<link rel="icon" type="image/x-icon" href="/favicon.ico">
<meta name="description" content="{%server_name%}: {%title%}">
</head>
<body{%class%}>


@ -2,15 +2,17 @@
<html lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8">
<title>4get</title>
<title>{%server_name%}</title>
<link title="{%server_name%}" href="/opensearch{%ac%}" rel="search" type="application/opensearchdescription+xml">
<link rel="sitemap" type="application/xml" title="Sitemap" href="/sitemap">
<meta name="viewport" content="width=device-width,initial-scale=1">
<link rel="stylesheet" href="/static/style.css?v4">
<link rel="stylesheet" href="/static/style.css?v{%version%}">
{%style%}
<meta name="robots" content="index,follow">
<link rel="icon" type="image/x-icon" href="/favicon.ico">
<meta name="description" content="4get.ca: They live in our walls!">
<link rel="search" type="application/opensearchdescription+xml" title="4get" href="/opensearch.xml">
<meta name="description" content="{%server_name%}: {%server_short_description%}">
</head>
<body class="home {%body_class%}">
<body class="home">
<div id="center">
<form method="GET" autocomplete="off" action="web">
<div class="logo">
@ -26,13 +28,12 @@
<div class="autocomplete"></div>
</div>
</form>
<a href="settings">Settings</a><a href="api.txt">API</a><a href="about">About</a><a href="https://git.lolcat.ca/lolcat/4get">Source</a><a href="https://ko-fi.com/lolcat" rel="noreferrer" target="BLANK">Donate</a>
<a href="settings">Settings</a><a href="instances">Instances</a><a href="api.txt">API</a><a href="about">About</a><a href="https://git.lolcat.ca/lolcat/4get">Source</a><a href="https://ko-fi.com/lolcat" rel="noreferrer" target="BLANK">Donate</a>
<div class="subtext">
Clearnet: <a href="https://4get.ca">4get.ca</a><br>
Tor: <a href="http://4getwebfrq5zr4sxugk6htxvawqehxtdgjrbcn2oslllcol2vepa23yd.onion">4getwebfrq5zr4sxugk6htxvawqehxtdgjrbcn2oslllcol2vepa23yd.onion</a><br>
Report a problem: <a href="https://lolcat.ca">lolcat.ca</a>
<a href="https://4get.ca">Clearnet</a><a href="http://4getwebfrq5zr4sxugk6htxvawqehxtdgjrbcn2oslllcol2vepa23yd.onion">Tor</a><a href="https://lolcat.ca">Report a problem</a><br>
Running on <b>v{%version%}</b>!!
</div>
</div>
<script src="/static/client.js?v4"></script>
<script src="/static/client.js?v{%version%}"></script>
</body>
</html>


@ -2,6 +2,6 @@
{%images%}
</div>
{%nextpage%}
<script src="/static/client.js?v3"></script>
<script src="/static/client.js?v{%version%}"></script>
</body>
</html>

36
template/instances.html Normal file

@ -0,0 +1,36 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8">
<title>Instance browser</title>
<link title="{%server_name%}" href="/opensearch{%ac%}" rel="search" type="application/opensearchdescription+xml">
<link rel="stylesheet" href="/static/style.css?v{%version%}">
{%style%}
<meta name="viewport" content="width=device-width,initial-scale=1">
<meta name="robots" content="index,follow">
<link rel="icon" type="image/x-icon" href="/favicon.ico">
<meta name="description" content="{%server_name%}: Instances">
</head>
<body class="instances">
<h1>Instance browser</h1>
Learn how to set up your own instance here! <a href="https://git.lolcat.ca/lolcat/4get" target="_BLANK">https://git.lolcat.ca/lolcat/4get</a>
<noscript>
<div class="quote">For a better experience, whitelist javascript usage on this page.</div>
<table>
<thead>
<tr>
<th class="extend">Server</th>
</tr>
</thead>
<tbody>
{%instances_html%}
</tbody>
</table>
</noscript>
<div id="popup-bg"></div>
<div class="popup-wrapper">
<div class="popup"></div>
</div>
<script src="static/serverping.js?v{%version%}"></script>
</body>
</html>


@ -11,6 +11,6 @@
{%left%}
</div>
</div>
<script src="/static/client.js?v4"></script>
<script src="/static/client.js?v{%version%}"></script>
</body>
</html>


@ -3,6 +3,8 @@
/*
Initialize random shit
*/
include "data/config.php";
include "lib/frontend.php";
$frontend = new frontend();
@ -28,20 +30,7 @@ try{
}catch(Exception $error){
echo
$frontend->drawerror(
"Shit",
'This scraper returned an error:' .
'<div class="code">' . htmlspecialchars($error->getMessage()) . '</div>' .
'Things you can try:' .
'<ul>' .
'<li>Use a different scraper</li>' .
'<li>Remove keywords that could cause errors</li>' .
'<li>Use another 4get instance</li>' .
'</ul><br>' .
'If the error persists, please <a href="/about">contact the administrator</a>.'
);
die();
$frontend->drawscrapererror($error->getMessage(), $get, "videos");
}
$categories = [

17
web.php

@ -3,6 +3,8 @@
/*
Initialize random shit
*/
include "data/config.php";
include "lib/frontend.php";
$frontend = new frontend();
@ -28,20 +30,7 @@ try{
}catch(Exception $error){
echo
$frontend->drawerror(
"Shit",
'This scraper returned an error:' .
'<div class="code">' . htmlspecialchars($error->getMessage()) . '</div>' .
'Things you can try:' .
'<ul>' .
'<li>Use a different scraper</li>' .
'<li>Remove keywords that could cause errors</li>' .
'<li>Use another 4get instance</li>' .
'</ul><br>' .
'If the error persists, please <a href="/about">contact the administrator</a>.'
);
die();
$frontend->drawscrapererror($error->getMessage(), $get, "web");
}
/*