Compare commits

No commits in common. "stable" and "dev" have entirely different histories.
stable ... dev

9 changed files with 164 additions and 317 deletions

@ -12,11 +12,11 @@ Get notifications from your feed reader when your favourite thread is about to d
4) Profit! RSS feed will include only the threads matching your querry so every notification your feed reader will send means your 4) Profit! RSS feed will include only the threads matching your querry so every notification your feed reader will send means your
watched thread is about to die watched thread is about to die
*NOTE THAT THIS IS AN BETA RELEASE, IF YOUR THREAD DIES BECAUSE OF RSS-WATCHER MALLFUNCTION DO NOT BLAME ME* pls *NOTE THAT THIS IS AN ALPHA RELEASE, IF YOUR THREAD DIES BECAUSE OF RSS-WATCHER MALLFUNCTION DO NOT BLAME ME* pls
** Getting custom URL ** Getting custom URL
URL without any params (just ~/feed~) won't work. You must specify at least one ~q~ or ~Q~. See bellow. URL without any params (just ~/feed.xml~) won't work. You must specify at least one ~q~. See bellow.
*** Crafting URL by hand *** Crafting URL by hand
@ -24,14 +24,13 @@ Right now there is no automated way to generate your feed url but making one by
**** URL parameters **** URL parameters
Please note that default values may vary depending on which instance/host/board you use, these are the defaults that come with Please note that default values may vary depending on which host you use, these are the defaults that come with this software but
this software but anyone running instance of RSS thread watcher can change them anyone running instance of RSS thread watcher can change them
| Param name | Values [default] | Can have multiple? | Mandatory? | Short description | | Param name | Values [default] | Can have multiple? | Mandatory? | Short description |
|------------+-------------------------+--------------------+-------------------------+--------------------------------------------------------------------------------------------------| |------------+-------------------------+--------------------+-------------------------+--------------------------------------------------------------------------------------------------|
| board | "mlp" | No | No | Which board to generate feed for, only boards enabled by host will work | | board | "mlp" | No | No | Which board to generate feed for, only boards enabled by host will work |
| q | nil | Yes | Yes (1 or more) | This string is used to filter threads according to their titles, *REGEX NOT supported* yet | | q | nil | Yes | Yes (1 or more) | This string is used to filter threads according to their titles, *REGEX NOT supported* yet |
| Q | nil | Yes | No if ~q~ is present | This string is used to filter threads according to their titles, but is CaseSensitive |
| chod | 60-99 [94] | No | No | CHanceOfDeath - will include thread in the feed if it's chance to death is > chod | | chod | 60-99 [94] | No | No | CHanceOfDeath - will include thread in the feed if it's chance to death is > chod |
| repeat | true, paranoid, [false] | No | No (partly implemented) | Whether to make new notification on every server update even when thread doesnt have higher chod | | repeat | true, paranoid, [false] | No | No (partly implemented) | Whether to make new notification on every server update even when thread doesnt have higher chod |
| recreate | ~bool~ | Not implemented | Not implemented | Whether to notify when creation of new thread matching querry is detected (uses 4chans RSS) | | recreate | ~bool~ | Not implemented | Not implemented | Whether to notify when creation of new thread matching querry is detected (uses 4chans RSS) |
@ -41,14 +40,14 @@ this software but anyone running instance of RSS thread watcher can change them
Standart rules of URLs apply, if you know how to pass params in URL to any website, you don't even have to read this Standart rules of URLs apply, if you know how to pass params in URL to any website, you don't even have to read this
- Open some text editor - Open some text editor
- Paste in default URL: ~https://tools.treebrary.org/thread-watcher/feed?~ (you can use plain HTTP if you want to) - Paste in default URL: ~https://tools.treebrary.org/thread-watcher/feed.xml?~ (you can use plain HTTP if you want to)
- Now you can append any of the supported parameters (which you can find in the above table): - Now you can append any of the supported parameters (which you can find in the above table):
- For example if we want to be informed about threads with "cute" in their title - For example if we want to be informed about threads with "cute" in their title
- ~q=cute~ which would make ~https://tools.treebrary.org/thread-watcher/feed?q=cute~ - ~q=cute~ which would make ~https://tools.treebrary.org/thread-watcher/feed.xml?q=cute~
- If you want more than one param, separate with ~&~, for example: - If you want more than one param, separate with ~&~, for example:
- ~q=cute~ and ~q=pretty~ would be ~https://tools.treebrary.org/thread-watcher/feed?q=cute&q=pretty~ - ~q=cute~ and ~q=pretty~ would be ~https://tools.treebrary.org/thread-watcher/feed.xml?q=cute&q=pretty~
- Same is true for when you also want to specify ChoD - Same is true for when you also want to specify ChoD
- ~https://tools.treebrary.org/thread-watcher/feed?q=cute&q=pretty&chod=98~ - ~https://tools.treebrary.org/thread-watcher/feed.xml?q=cute&q=pretty&chod=98~
- This will only notify you about threads that: - This will only notify you about threads that:
- Have ~cute~ or ~pretty~ in their title - Have ~cute~ or ~pretty~ in their title
- Are in the lowest 98% part of catalog (it's on position ~147/150 e.g. 3 threads before being bumped off) - Are in the lowest 98% part of catalog (it's on position ~147/150 e.g. 3 threads before being bumped off)
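The hand-crafting steps above can also be scripted. The following is a minimal sketch, assuming a hypothetical helper named ~feed-url~ that is not part of RSS Thread Watcher. The base URL is passed in as an argument because the two sides of this diff serve the feed at ~/feed~ and ~/feed.xml~ respectively.

```clojure
(require '[clojure.string :as s])
(import 'java.net.URLEncoder)

;; Hypothetical helper, not part of the project.
(defn feed-url
  "Builds a feed URL from [base], a seq of title queries [qs]
  and an optional [chod] (pass nil to leave it out)."
  [base qs chod]
  (let [enc    (fn [v] (URLEncoder/encode (str v) "UTF-8"))
        ;; one q=... pair per query, then chod=... if given
        params (concat (map (fn [q] (str "q=" (enc q))) qs)
                       (when chod [(str "chod=" (enc chod))]))]
    (str base "?" (s/join "&" params))))

(println (feed-url "https://tools.treebrary.org/thread-watcher/feed"
                   ["cute" "pretty"]
                   98))
```

Repeating ~q~ once per search term matches the "Can have multiple?" column of the parameter table; the encoding step only matters for terms that contain spaces or other reserved characters.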
@ -79,9 +78,7 @@ See [[https://git.treebrary.org/Treebrary.org/rss-thread-watcher/issues?q=&type=
- [ ] Support notification for thread death - [ ] Support notification for thread death
- [X] Support multiple boards at once - [X] Support multiple boards at once
- [ ] Support async responses - [ ] Support async responses
- [ ] Graal VM support for native compilation - [ ] Graal VM support for native configuration
For more up to date and complete list of features, check [[https://git.treebrary.org/Treebrary.org/rss-thread-watcher/projects][open projects]].
** Self hosting ** Self hosting
@ -90,14 +87,8 @@ options.
*** Prebuilt *** Prebuilt
Download newest release from [[https://git.treebrary.org/Treebrary.org/rss-thread-watcher/releases][releases]] and run them like you would any other java executable (example bellow). There are two JARs, Download newest release from [[https://git.treebrary.org/Treebrary.org/rss-thread-watcher/releases][releases]] and run them like you would any other java executable, default port is ~6969~
first requires you to have Clojure installed (~rss-thread-watch-[version]~), second one doesn't (the one with ~standalone~ in ~$ java -jar whatEverNameTheReleaseHas.jar~~
name).
~$ java -jar whatEverNameTheReleaseHas.jar~
Default port is ~6969~ so either change it or allow it in your firewall. HTTPS is not supported, so using some reverse proxy like
Nginx could be the easiest solution for that.
*** From source *** From source
Not officially supported, if you'll attempt this, please, use source from release tarball or checkout ~release~ or ~stable~ Not officially supported, if you'll attempt this, please, use source from release tarball or checkout ~release~ or ~stable~

@ -1,4 +1,4 @@
(defproject rss-thread-watch "0.4.9-SNAPSHOT" (defproject rss-thread-watch "0.4.0-SNAPSHOT"
:description "RSS based thread watcher" :description "RSS based thread watcher"
:url "http://example.com/FIXME" :url "http://example.com/FIXME"
:license {:name "AGPL-3.0-only" :license {:name "AGPL-3.0-only"
@ -9,6 +9,6 @@
[clj-rss "0.4.0"] [clj-rss "0.4.0"]
[org.clojure/data.json "2.4.0"] [org.clojure/data.json "2.4.0"]
[org.clojure/tools.cli "1.1.230"]] [org.clojure/tools.cli "1.1.230"]]
:main rss-thread-watch.core :main ^:skip-aot rss-thread-watch.core
:target-path "target/%s" :target-path "target/%s"
:profiles {:uberjar {:aot :all}}) :profiles {:uberjar {:aot :all}})

@ -3,11 +3,7 @@
;; Message displayed when requested board is not enabled ;; Message displayed when requested board is not enabled
:board-disabled-message "This board is not enabled for feed generation.\n\nYou can contact me here: [contact]" :board-disabled-message "This board is not enabled for feed generation.\n\nYou can contact me here: [contact]"
;; :enable-board-listing true ;Whether to show list of enabled boards in /boards UNIMPLEMENTED ;; :enable-board-listing true ;Whether to show list of enabled boards in /boards UNIMPLEMENTED
;; The watcher feed will be served by this url, everything else will be 404
:served-filename "/feed"
;; This is homepage for your feed, it should probably redirect somewhere where you mention
;; What things you have enabled and where to find full docs
:homepage "https://git.treebrary.org/Treebrary.org/rss-thread-watcher"
;; This map defines default values for all enabled boards, if you wish for some board ;; This map defines default values for all enabled boards, if you wish for some board
;; to use different values, specify them bellow in :borads-enabled ;; to use different values, specify them bellow in :borads-enabled
:boards-defaults { :boards-defaults {
@ -18,12 +14,10 @@
:starting-page 7 :starting-page 7
;; Default ChOD to use if none is specified by the user ;; Default ChOD to use if none is specified by the user
:default-chod 94 :default-chod 94
;; This is target for Catalog API requests ;; If you want to do some preprocessing beforehand, you can override
;; {board} will be substitued for board ;; Target URL for the board, but the response must be same the 4chan API would return
:target "https://api.4chan.org/{board}/catalog.json" ;; /$board/catalog.json will be appended to this link
;; This is where threads actually reside if different from :target :target "https://api.4chan.org"
;; you can use {board} and {threadnum} for substitutions
:host "https://boards.4chan.org/{board}/thread/{threadnum}"
;; Commented parts bellow are still unimplemented ;; Commented parts bellow are still unimplemented
;; ------ ;; ------
;; Only download catalog when someone requests feed and cache is old ;; Only download catalog when someone requests feed and cache is old

@ -1,108 +0,0 @@
;; Copyright (C) 2024 Felisp
;;
;; This program is free software: you can redistribute it and/or modify
;; it under the terms of the GNU Affero General Public License as published by
;; the Free Software Foundation, version 3 of the License.
;;
;; This program is distributed in the hope that it will be useful,
;; but WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
;; GNU Affero General Public License for more details.
;;
;; You should have received a copy of the GNU Affero General Public License
;; along with this program. If not, see <https://www.gnu.org/licenses/>.
(ns rss-thread-watch.config
"Functions for working with configuration"
(:require [clojure.java.io :as io]
[clojure.edn :as edn]
[clojure.string :as s]
[rss-thread-watch.utils :as u])
(:gen-class))
;; Verification TODO: check if all required keys are included so we don't get nils
(def VERSION "0.4.9")
(def GLOBAL-CONFIG
"Global config with defaults for missing entires"
;; I know globals are ew in Clojure but I don't know any
;; better way of doing this
(atom nil))
;; Internal default config
(def CONFIG-DEFAULT
"Internal default config"
{:port 6969
:default-board "/mlp/"
:enable-board-listing true
:served-filename "/feed"
:homepage "https://git.treebrary.org/Treebrary.org/rss-thread-watcher"
:board-disabled-message "This board is not enabled for feed generation.\n\nYou can contact me here: [contact] and I may enable it for you"
:boards-defaults {:refresh-rate 300
:starting-page 7
:default-chod 94
:target "https://api.4chan.org/{board}/catalog.json"
:host "https://boards.4chan.org/{board}/thread/{threadnum}"
:lazy-load true}
:boards-enabled {"/mlp/" {:lazy-load false}
"/g/" {:starting-page 7}
"/po/" {:starting-page 8
:refresh-rate 86400}
"/p/" {:starting-page 8
:refresh-rate 1800}}})
(defn load-config
"Attempts to load config from file [f].
Returns loaded config map or nil if failed"
[f]
(let [fl (io/as-file f)]
(when (.exists fl)
(with-open [r (io/reader fl)]
(edn/read (java.io.PushbackReader. r))))))
(defn config-url-expand
"Expands substitution in :target and :host fields"
[filled-config]
(let [boards (get filled-config :boards-enabled)
selecting '(:target :host)
pattern "{board}"]
(assoc filled-config
:boards-enabled
(u/fmap (fn [board confs]
(->> (select-keys confs selecting)
(u/fmap (fn [_ v]
(s/replace v pattern (s/replace board "/" ""))))
(merge confs)))
boards))))
(defn config-fill-board-defaults
;; TODO: must have check that if board is default, it's enabled, if it's not, give some big fat warning
;; that users must always specify board, maybe change the error?
"Fills every enabled board with default config values"
[config]
(let [defaults (:boards-defaults config)]
(as-> config conf
(update-in conf
'(:boards-enabled)
(fn [mp]
(u/fmap (fn [k v]
(assoc (u/map-apply-defaults v defaults) :name k))
mp)))
(dissoc conf :boards-defaults)
(config-url-expand conf))))
(defn get-some-config
"Attempts to get config somehow,
first from [custom-file], if it's nil,
then from ./config.edn file.
If is neither exists, default internal one is used."
[custom-file]
(config-fill-board-defaults
;; TODO: There has to be try/catch for when file is invalid edn
;; This is gonna be done when config validation comes in Beta 2
(let [file-to-try (u/nil?-else custom-file
"./config.edn")]
(u/when-else (load-config file-to-try)
CONFIG-DEFAULT))))
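
The ~{board}~ substitution performed by ~config-url-expand~ can be looked at in isolation. This is a minimal sketch, assuming a hypothetical helper named ~expand-board-url~; it only mirrors the inner ~s/replace~ call and leaves ~{threadnum}~ untouched, as the original does.

```clojure
(require '[clojure.string :as s])

;; Hypothetical helper, not part of the project.
(defn expand-board-url
  "Replaces {board} in [template] with the board name stripped of slashes."
  [template board]
  (s/replace template "{board}" (s/replace board "/" "")))

(println (expand-board-url "https://api.4chan.org/{board}/catalog.json" "/mlp/"))
(println (expand-board-url "https://boards.4chan.org/{board}/thread/{threadnum}" "/g/"))
```

Because both arguments to ~s/replace~ are plain strings, the braces are matched literally and no regex escaping is needed.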

@ -13,15 +13,36 @@
;; along with this program. If not, see <https://www.gnu.org/licenses/>. ;; along with this program. If not, see <https://www.gnu.org/licenses/>.
(ns rss-thread-watch.core (ns rss-thread-watch.core
(:require [clojure.tools.cli :refer [parse-opts]] (:require [clojure.java.io :as io]
[clojure.edn :as edn]
[clojure.tools.cli :refer [parse-opts]]
[ring.adapter.jetty :as jetty] [ring.adapter.jetty :as jetty]
[ring.middleware.params :as rp] [ring.middleware.params :as rp]
[rss-thread-watch.watcher :as watcher] [rss-thread-watch.watcher :as watcher]
[rss-thread-watch.feed-generator :as feed] [rss-thread-watch.feed-generator :as feed]
[rss-thread-watch.utils :as u] [rss-thread-watch.utils :as u])
[rss-thread-watch.config :as conf])
(:gen-class)) (:gen-class))
(def VERSION "0.4.0")
;; Internal default config
(def CONFIG-DEFAULT
"Internal default config"
{:port 6969
:default-board "/mlp/"
:enable-board-listing true
:board-disabled-message "This board is not enabled for feed generation.\n\nYou can contact me here: [contact] and I may enable it for you"
:boards-defaults {:refresh-rate 300
:starting-page 7
:default-chod 94
:target "https://api.4chan.org"
:lazy-load true}
:boards-enabled {"/mlp/" {:lazy-load false}
"/g/" {:starting-page 7}
"/po/" {:starting-page 8
:refresh-rate 86400}
"/p/" {:starting-page 8
:refresh-rate 1800}}})
(def cli-options (def cli-options
"Configuration defining program arguments for cli.tools" "Configuration defining program arguments for cli.tools"
@ -45,6 +66,41 @@
(println "Error while updating cache: " e ", retrying in " (/ ms 1000 60) " minutes")))) (println "Error while updating cache: " e ", retrying in " (/ ms 1000 60) " minutes"))))
(Thread/sleep ms))))) (Thread/sleep ms)))))
(defn load-config
"Attempts to load config from file [f].
Returns loaded config map or nil if failed"
[f]
(let [fl (io/as-file f)]
(when (.exists fl)
(with-open [r (io/reader fl)]
(edn/read (java.io.PushbackReader. r))))))
(defn config-fill-board-defaults
"Fills every enabled board with default config values"
[config]
(let [defaults (:boards-defaults config)]
(dissoc (update-in config
'(:boards-enabled)
(fn [mp]
(u/fmap (fn [k v]
(u/map-apply-defaults v defaults))
mp)))
:boards-defaults)))
(defn get-some-config
"Attempts to get config somehow,
first from [custom-file], if it's nil,
then from ./config.edn file.
If is neither exists, default internal one is used."
[custom-file]
(config-fill-board-defaults
;; TODO: There has to be try/catch for when file is invalid edn
;; This is gonna be done when config validation comes in Beta 2
(let [file-to-try (u/nil?-else custom-file
"./config.edn")]
(u/when-else (load-config file-to-try)
CONFIG-DEFAULT))))
(defn -main (defn -main
"Entry point, starts webserver" "Entry point, starts webserver"
[& args] [& args]
@ -54,41 +110,36 @@
(println "Error: " err) (println "Error: " err)
(System/exit 1)) (System/exit 1))
(when (get options :version) (when (get options :version)
(println "RSS Thread Watcher " conf/VERSION " Licensed under AGPL-3.0-only") (println "RSS Thread Watcher " VERSION " Licensed under AGPL-3.0-only")
(System/exit 0)) (System/exit 0))
(when (get options :help) (when (get options :help)
(println "RSS Thread Watcher help:\n" (get parsed-args :summary)) (println "RSS Thread Watcher help:\n" (get parsed-args :summary))
(System/exit 0)) (System/exit 0))
(when (get options :print-default-config) (when (get options :print-default-config)
(println ";;Default internal config file from RSS Thread Watcher " conf/VERSION) (println ";;Default internal config file from RSS Thread Watcher " VERSION)
(clojure.pprint/pprint conf/CONFIG-DEFAULT) (clojure.pprint/pprint CONFIG-DEFAULT)
;; In case someone was copying by hand, this might be useful ;; In case someone was copying by hand, this might be useful
(println ";;END of Default internal config file") (println ";;END of Default internal config file")
(System/exit 0)) (System/exit 0))
(let [config (conf/get-some-config (:config options))] (let [config (get-some-config (:config options))]
;; TODO: probably refactor to use separate config.clj file when validation will be added ;; TODO: probably refactor to use separate config.clj file when validation will be added
;; Init the few globals we have ;; Init the few globals we have
;; TODO: this all needs to go in separate function so it doesnt have to duplicated in repl-main (reset! watcher/GLOBAL-CONFIG config)
(reset! conf/GLOBAL-CONFIG config)
(reset! feed/boards-enabled-cache (set (keys (get config :boards-enabled)))) (reset! feed/boards-enabled-cache (set (keys (get config :boards-enabled))))
(reset! watcher/chod-threads-cache (watcher/generate-chod-cache-structure config)) (reset! watcher/chod-threads-cache (watcher/generate-chod-cache-structure config))
(clojure.pprint/pprint config) (clojure.pprint/pprint config)
(jetty/run-jetty (rp/wrap-params feed/http-handler) {:port (:port config) (jetty/run-jetty (rp/wrap-params feed/http-handler) {:port (:port CONFIG-DEFAULT)
:join? true})))) :join? true}))))
;; Docs: https://github.com/ring-clojure/ring/wiki/Getting-Started ;; Docs: https://github.com/ring-clojure/ring/wiki/Getting-Started
(defn repl-main (defn repl-main
"Development entry point" "Development entry point"
[] []
(let [config (conf/get-some-config nil)]
;; TODO: probably refactor to use separate config.clj file when validation will be added
;; Init the few globals we have
(reset! conf/GLOBAL-CONFIG config)
(reset! feed/boards-enabled-cache (set (keys (get config :boards-enabled))))
(reset! watcher/chod-threads-cache (watcher/generate-chod-cache-structure config)))
(jetty/run-jetty (rp/wrap-params #'feed/http-handler) (jetty/run-jetty (rp/wrap-params #'feed/http-handler)
{:port (:port conf/CONFIG-DEFAULT) {:port (:port CONFIG-DEFAULT)
;; Dont block REPL thread ;; Dont block REPL thread
:join? false})) :join? false}))
;; (repl-main) ;; (repl-main)
;; Single cache update for repl
;; (watcher/update-thread-cache! (:target CONFIG) (:starting-page CONFIG))

@ -14,13 +14,12 @@
(ns rss-thread-watch.feed-generator (ns rss-thread-watch.feed-generator
"Generates feeds for requests" "Generates feeds for requests"
(:require [ring.util.response :as response] (:require [ring.middleware.params :as rp]
[ring.util.response :as response]
[clj-rss.core :as rss] [clj-rss.core :as rss]
[clojure.string :as s] [clojure.string :as s]
[rss-thread-watch.watcher :as watcher] [rss-thread-watch.watcher :as watcher]
[rss-thread-watch.utils :as ut] [rss-thread-watch.utils :as ut])
[rss-thread-watch.config :as conf]
[rss-thread-watch.filters :as f])
(:gen-class)) (:gen-class))
(def boards-enabled-cache (def boards-enabled-cache
@ -34,146 +33,117 @@
This is done by always making new GUID - (concat thread-number UNIX-time-of-data-update)" This is done by always making new GUID - (concat thread-number UNIX-time-of-data-update)"
[thread time] [thread time]
(assoc thread :guid (str (:no thread) (assoc thread :guid (str (:no thread)
"-" "-"
time))) time)))
(defn new-guid-paranoid
"Generate unique GUID on EVERY request to the feed.
Usefull for when you really don't want thread to die.
By making new GUID on every request you can be sure that what you're
seeing in your feed reader isn't cached"
[thread]
(new-guid-always thread (System/currentTimeMillis)))
(defn update-only-guid (defn update-only-guid
"Generates new GUID for feed item ONLY if the threads ChoD increased "Generates new GUID for feed item ONLY if the threads ChoD increased
This is done by concating thread-number and it's rounded chod" This is done by concating thread-number and it's rounded chod"
[thread] [thread]
(assoc thread :guid (format "%d-%d-%.2f" (assoc thread :guid (format "%d-%.2f"
(:no thread) (:no thread)
(:last-modified thread)
(:chod thread)))) (:chod thread))))
(defn make-filters
"Creates map of functions and filters from query string.
Return format is: {filter-fun ['words' 'to' 'filter' 'using this function]}"
[query-string known-filter-map]
(let [filterable (select-keys query-string
(keys known-filter-map))]
(ut/fkmap (fn [k v]
{(get known-filter-map k) (ut/vectorize v)})
filterable)))
(defn filter-chod-posts (defn filter-chod-posts
"Return list of all threads with equal or higher ChoD than requested" "Return list of all threads with equal or higher ChoD than requested
[filters chod-treshold repeat? board-cache]
READS FROM GLOBALS: watcher.time-of-cache"
[query-vec chod-treshold repeat? board-cache]
(let [{time-of-generation :time (let [{time-of-generation :time
cache :data} board-cache cache :data} board-cache
guid-fn (case repeat? guid-fn (if repeat? (fn [x] (new-guid-always x time-of-generation))
"paranoid" new-guid-paranoid update-only-guid)
"true" (fn [x] (new-guid-always x time-of-generation))
update-only-guid)
cache-start-index (first (ut/indices (fn [x] (>= (:chod x) chod-treshold)) cache-start-index (first (ut/indices (fn [x] (>= (:chod x) chod-treshold))
cache)) cache))
;; So we don't have to search thru everything we have cached ;; So we don't have to search thru everything we have cached
needed-cache-part (subvec cache cache-start-index) needed-cache-part (subvec cache cache-start-index)
actuall-matches (keep (fn [thread] actuall-matches (keep (fn [t]
(some (let [title (:title t)]
(fn [fun] ;; Todo: Man, wouldn't it be cool to know which querry matched the thread?
(when (fun thread (get filters fun)) ;; Would be so much easier for user to figure out why is it showing
thread)) ;; and it would solve the problem of super long titles (or OPs instead of titles)
(keys filters))) (when (some (fn [querry]
(s/includes? (s/lower-case title) (s/lower-case querry)))
query-vec)
t)))
(reverse needed-cache-part))] (reverse needed-cache-part))]
;; Finally generate and append GUIDs ;; Finally generate and append GUIDs
(map guid-fn actuall-matches))) (map guid-fn actuall-matches)))
(defn thread-to-rss-item (defn thread-to-rss-item
"Converts cached thread item to feed item which can be serialized into RSS" "If I wasnt retarded I could have made the cached version look like
[t host] rss item already but what can you do. I'll refactor I promise, I just need this done ASAP" ;Todo: do what the docstring says
(let [link-url (s/replace host "{threadnum}" (str (:no t)))] [t] ;TODO: oh Luna the hardcodes ;;RESUME
{:title (format "%.2f%% - %s" (:chod t) (:title t)) (let [link-url (str "https://boards.4chan.org/mlp/thread/" (:no t))] ; jesus, well I said only /mlp/ is supported now so fuck it
{:title (format "%.2f%% - %s" (:chod t) (:title t)) ;TODO: Generate link from the target somehow, or just include it from API response
;; :url link-url <- this is supposed to be for images according to: https://cyber.harvard.edu/rss/rss.html ;; :url link-url <- this is supposed to be for images according to: https://cyber.harvard.edu/rss/rss.html
:description (format "The thread: '%s' has %.2f%% chance of dying" (:title t) (:chod t)) :description (format "The thread: '%s' has %.2f%% chance of dying" (:title t) (:chod t))
:link link-url :link link-url
:guid (:guid t)})) :guid (:guid t)}))
(defn generate-feed (defn generate-feed
"Generates feed from matching items "Generates feed from matching items"
[query-vec chod-treshold repeat? cache]
READS FROM GLOBALS: (let [items (filter-chod-posts query-vec chod-treshold repeat? cache)
rss-thread-watch.config/VERSION head {:title "RSS Thread watcher v0.4" ;TODO: hardcoded string here, remake to reference to config.clj
rss-thread-watch.config/GLOBAL_CONFIG" :link "https://tools.treebrary.org/thread-watcher/feed.xml"
[filters chod-treshold repeat? cache board-config self-link] :feed-url "https://tools.treebrary.org/thread-watcher/feed.xml"
(let [items (filter-chod-posts filters chod-treshold repeat? cache)
head {:title (str "RSS Thread watcher v" conf/VERSION)
;; :link is the homepage of the channel
:link (get @conf/GLOBAL-CONFIG :homepage)
;; :feed-url is where you can get new items, must much the url this is served at
:feed-url self-link
:description "RSS based thread watcher"} :description "RSS based thread watcher"}
body (map #(thread-to-rss-item body (map thread-to-rss-item items)]
%1
(get board-config :host)) items)]
(rss/channel-xml head body))) (rss/channel-xml head body)))
(defn http-handler (defn http-handler
"Handles HTTP requests, returns generated feed "Handles HTTP requests, returns generated feed
READS FROM GLOBALS: READS FROM GLOBALS:
rss-thread-watch.watcher/chod-threads-cache rss-thread-watch.watcher.chod-threads-cache
rss-thread-watch.config/GLOBAL-CONFIG" rss-thread-watch.watcher.GLOBAL-CONFIG" ;TODO: Update if it really reads from there anymore
[rqst] [rqst]
(try (let [served-filename (get @conf/GLOBAL-CONFIG :served-filename) (try (let [{{chod "chod"
{{chod "chod" board "board" :or {chod "94"
board "board" board (get @watcher/GLOBAL-CONFIG :default-board)}
repeat? "repeat" :or {chod "94"
board (get @conf/GLOBAL-CONFIG :default-board)
repeat? false}
:as prms} :params :as prms} :params
uri :uri uri :uri} rqst
query :query-string qrs (prms "q")
scheme :scheme queries (if (vector? qrs) qrs [qrs]) ; to always return vector
server-name :server-name} rqst repeat? (prms "repeat")
filters (make-filters prms f/known-filters) real-chod (if-let [ch (or (and (vector? chod)
;; BUG if local fileserver not running -> FileNotFound exception is thrown and it fucks up the feed generation (first chod))
;; Should be handled because wrong config and thus url generation could do the same chod)]
self-uri (str (s/replace-first scheme ":" "") ; (try ;If we can't parse number from chod, use default 94
"://" server-name uri "?" query) (if (or (vector? chod)
board-config (get-in @conf/GLOBAL-CONFIG [:boards-enabled board]) (<= (Integer/parseInt chod) 60)) ; Never accept chod lower than 60 TODO: don't hardcode this
real-chod (try (max (Integer/parseInt (or (and (vector? chod) 60 (Integer/parseInt chod))
(first chod)) (catch Exception e
chod)) 60) ;HARDCODED CHoD 94)))
(catch Exception _
(get board-config :default-chod)))
cache @watcher/chod-threads-cache] cache @watcher/chod-threads-cache]
(println "\n\nRCVD: " rqst) (println "\n\nRCVD: " rqst)
;; (println rqst) ;; (println rqst)
;; ====== Errors ===== ;; ====== Errors =====
;; Something other than $served-filename requested ;; Something other than feed.xml requested
(when-not (s/ends-with? uri served-filename) (when-not (s/ends-with? uri "feed.xml")
(throw (ex-info "404" {:status 404 (throw (ex-info "404" {:status 404
:header {"Content-Type" "text/plain"} :header {"Content-Type" "text/plain"}
:body (str "404 This server has nothing but " served-filename)}))) :body "404 This server has nothing but /feed.xml"})))
(when-not (contains? @boards-enabled-cache board) (when-not (contains? @boards-enabled-cache board)
(throw (ex-info "403" {:status 403 (throw (ex-info "403" {:status 403
:header {"Content-Type" "text/plain"} :header {"Content-Type" "text/plain"}
:body (get @conf/GLOBAL-CONFIG :board-disabled-message)}))) :body (get @watcher/GLOBAL-CONFIG :board-disabled-message)})))
;; No url params -> we redirect to documentation about params ;; No url params -> we redirect to documentation about params
(when (empty? prms) (when (empty? prms)
(throw (ex-info "302" (throw (ex-info "302"
(response/redirect (get @conf/GLOBAL-CONFIG :homepage))))) (response/redirect "https://git.treebrary.org/Treebrary.org/rss-thread-watcher#headline-4"))))
;; No querry specified - don't know what to search for ;; No querry specified - don't know what to search for
(when-not (some f/known-filter-set (keys prms)) (when-not (prms "q")
(throw (ex-info "400" {:status 400 (throw (ex-info "400" {:status 400
:header {"Content-Type" "text/plain"} :header {"Content-Type" "text/plain"}
:body (str "400 You MUST specify query with one OR more'q=searchTerm' (or 'Q=SeARChteRm' for case sensitive) url parameter(s)\n\n\n" :body (str "400 You MUST specify query with one OR more'q=searchTerm' url parameter(s)\n\n\n"
"Exmple: '" served-filename "?q=pony&q=IWTCIRD' will show in your feed all threads with 'pony' or 'IWTCIRD'" "Exmple: '/feed.xml?q=pony&q=IWTCIRD' will show in your feed all threads with 'pony' or 'IWTCIRD'"
" in their title that are about to die.")}))) " in their title that are about to die.")})))
;; Whether cache has been generated yet ;; Whether cache has been generated yet
(when (empty? cache) (when (empty? cache)
(throw (ex-info "503" {:status 503 (throw (ex-info "503" {:status 503
:header {"Content-Type" "text/plain"} :header {"Content-Type" "text/plain"}
@ -184,7 +154,7 @@
;; There shouldn't be any problems with this mime type but if there are ;; There shouldn't be any problems with this mime type but if there are
;; replace with "text/xml", or even better, get RSS reader that is not utter shit ;; replace with "text/xml", or even better, get RSS reader that is not utter shit
:header {"Content-Type" "application/rss+xml"} :header {"Content-Type" "application/rss+xml"}
:body (generate-feed filters real-chod repeat? (watcher/get-thread-data board @conf/GLOBAL-CONFIG) board-config self-uri)}) :body (generate-feed queries real-chod repeat? (watcher/get-thread-data board @watcher/GLOBAL-CONFIG))})
(catch Exception e (catch Exception e
;; Ex-info has been crafted to match HTTP response body so we can send it ;; Ex-info has been crafted to match HTTP response body so we can send it
(if-let [caught (ex-data e)] (if-let [caught (ex-data e)]
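
The two versions of the ~real-chod~ binding above are easier to compare against a standalone function. This is a minimal sketch of the intent as read from the left-hand column: take the first value if the parameter was repeated, never go below 60, and fall back to a default when the value cannot be parsed. The name ~parse-chod~ is hypothetical, and no upper bound is applied because the diff does not show one.

```clojure
;; Hypothetical helper, not part of the project.
(defn parse-chod
  "Parses the chod URL parameter. [chod] may be a string, a vector of
  strings (parameter given more than once) or nil."
  [chod default-chod]
  (let [raw (if (vector? chod) (first chod) chod)]
    (try
      ;; Values below 60 are raised to 60
      (max 60 (Integer/parseInt (str raw)))
      (catch Exception _
        ;; Missing or non-numeric input: use the default
        default-chod))))

(println (parse-chod "98" 94))
(println (parse-chod ["97" "80"] 94))
(println (parse-chod "abc" 94))
```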

@ -1,37 +0,0 @@
;; Copyright (C) 2024 Felisp
;;
;; This program is free software: you can redistribute it and/or modify
;; it under the terms of the GNU Affero General Public License as published by
;; the Free Software Foundation, version 3 of the License.
;;
;; This program is distributed in the hope that it will be useful,
;; but WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
;; GNU Affero General Public License for more details.
;;
;; You should have received a copy of the GNU Affero General Public License
;; along with this program. If not, see <https://www.gnu.org/licenses/>.
(ns rss-thread-watch.filters
"Functions filtering posts"
(:require [clojure.string :as cs])
(:gen-class))
(defn case-sensitive-filter
"Returns true if the thread's :title contains any of [queries], nil otherwise. Case sensitive."
[{:keys [title]} queries]
(some (fn [querry]
(cs/includes? title querry))
queries))
(defn case-insensitive-filter
"Like case-sensitive-filter, but ignores case in both the :title and [queries]"
[{:keys [title]} queries]
(case-sensitive-filter {:title (cs/lower-case title)} (map cs/lower-case queries)))
(def known-filters
{"Q" case-sensitive-filter
"q" case-insensitive-filter})
(def known-filter-set (set (keys known-filters)))


@ -47,11 +47,6 @@
~x ~x
result#))) result#)))
(defmacro vectorize
"If arg is not a vector, put into vector, otherwise return it"
[v]
(if (vector? v) v [v]))
;; ===== Generic functions ==== ;; ===== Generic functions ====
(defn indices (defn indices
@ -74,28 +69,15 @@
{k (map-apply-defaults conf-val default-val)} {k (map-apply-defaults conf-val default-val)}
{k (nil?-else conf-val default-val)}))))) {k (nil?-else conf-val default-val)})))))
;; This is a shitty version of reduce-kv
(defn fmap (defn fmap
"Applies function [f] to every key and value in map [m] "Applies function [f] to every key and value in map [m]
Function signature should be (f [key value]). Function signature should be (f [key value])."
Key stays unchanged"
[f m] [f m]
(into (into
(empty m) (empty m)
(for [[key val] m] (for [[key val] m]
[key (f key val)]))) [key (f key val)])))
(defn fkmap
;; I am horrible with docstrings, I don't deny that
"Applies function [f] to every key and value in map [m]
Function signature should be (f [key value]).
Unlike fmap, you can change key too, so return both {key value} in map"
[f m]
(into
(empty m)
(for [[key val] m]
(f key val))))
(defn expand-home (defn expand-home
"Expands ~ to home directory" "Expands ~ to home directory"
;;modified from sauce: https://stackoverflow.com/questions/29585928/how-to-substitute-path-to-home-for ;;modified from sauce: https://stackoverflow.com/questions/29585928/how-to-substitute-path-to-home-for


@ -18,6 +18,12 @@
[clojure.data.json :as js]) [clojure.data.json :as js])
(:gen-class)) (:gen-class))
(def GLOBAL-CONFIG
"Global config with defaults for missing entries"
;; I know globals are ew in Clojure but I don't know any
;; better way of doing this
(atom nil))
(def chod-threads-cache (def chod-threads-cache
"Cached map of threads that have CHanceOfDeath > configured" "Cached map of threads that have CHanceOfDeath > configured"
(atom {})) (atom {}))
@ -32,7 +38,7 @@
(defn process-page
"Processes every thread in page, leaving only relevant information "Processes every thread in page, leaving only relevant information
(:title or :com, :no :chod :last_modified)" (title no chod)"
([threads-to-index threads-total starting-index] (process-page threads-to-index threads-total starting-index (transient []))) ([threads-to-index threads-total starting-index] (process-page threads-to-index threads-total starting-index (transient [])))
([remaining-threads threads-total index ret] ([remaining-threads threads-total index ret]
(if (empty? remaining-threads) (if (empty? remaining-threads)
@ -45,12 +51,11 @@
(:com thread) ;we use body if thread has it (:com thread) ;we use body if thread has it
"") ;Thread has neither, this prevents null pointer "") ;Thread has neither, this prevents null pointer
:no (:no thread) :no (:no thread)
:chod (* 100 (float (/ index threads-total))) :chod (* 100 (float (/ index threads-total)))}))))))
:last-modified (:last_modified thread)}))))))
(defn build-cache (defn build-cache
"Build cache of near-death threads so the values don't have to be recalculated on each request." "Build cache of near-death threads so the values don't have to be recalculated on each request."
[pages-to-index threads-per-page threads-total] [pages-to-index pages-total threads-per-page threads-total]
{:time (System/currentTimeMillis) {:time (System/currentTimeMillis)
:data (vec (flatten (map (fn [single-page] :data (vec (flatten (map (fn [single-page]
;; We have to (dec page-number) bcs otherwise we would get the total number of threads ;; We have to (dec page-number) bcs otherwise we would get the total number of threads
@ -64,9 +69,8 @@
[url] - Url to download data from [url] - Url to download data from
[board] - Board to assign cached data to, its existence is NOT checked here [board] - Board to assign cached data to, its existence is NOT checked here
[starting-page] - From which page consider threads to be fit for near-death cache [starting-page] - From which page consider threads to be fit for near-death cache
Returns :data part of [board] cache THIS FUNCTION WRITES TO chod-threads-cache
THIS FUNCTION WRITES TO: Returns :data part of [board] cache"
rss-thread-watch.watcher/chod-threads-cache"
[url board starting-page] [url board starting-page]
;; Todo: surround with try so we can timeout, 40x and other stuff ;; Todo: surround with try so we can timeout, 40x and other stuff
(let [catalog (with-open [readr (io/reader url)] (let [catalog (with-open [readr (io/reader url)]
@ -83,7 +87,7 @@
;; This will return nil and that fucks everything up ;; This will return nil and that fucks everything up
(println "Refreshed cache for " board) (println "Refreshed cache for " board)
(reset! (get @chod-threads-cache board) (reset! (get @chod-threads-cache board)
(build-cache to-index threads-per-page threads-total)))) (build-cache to-index pages-total threads-per-page threads-total))))
(defn board-enabled? (defn board-enabled?
"Checks whether board is enabled in config" "Checks whether board is enabled in config"
@ -103,13 +107,13 @@
MAY CAUSE WRITE TO chod-threads-cache IF NECESSARY" MAY CAUSE WRITE TO chod-threads-cache IF NECESSARY"
[board config] [board config]
(let [refresh-rate (* 1000 (get-in config `(:boards-enabled ~board :refresh-rate))) (let [refresh-rate (* 1000 (get-in config `(:boards-enabled ~board :refresh-rate)))
board-catalog-url (get-in config `(:boards-enabled ~board :target)) {data :data
{time-downloaded :time time-downloaded :time
:or {time-downloaded 0} :or {time-downloaded 0}
:as board-atom } @(get @chod-threads-cache board) :as board-atom } @(get @chod-threads-cache board)
;; TODO: This also makes it implicitly lazy-load -> if disabled make the check here ;; TODO: This also makes it implicitly lazy-load -> if disabled make the check here
time-to-update? (or (nil? board-atom) time-to-update? (or (nil? board-atom)
(> (System/currentTimeMillis) (+ refresh-rate time-downloaded)))] (> (System/currentTimeMillis) (+ refresh-rate time-downloaded)))]
(if time-to-update? (if time-to-update?
(update-board-cache! board-catalog-url board (get-in config [:boards-enabled board :starting-page])) (update-board-cache! (get-board-url board config) board (get-in config [:boards-enabled board :starting-page]))
@(get @chod-threads-cache board)))) @(get @chod-threads-cache board))))