Compare commits

No commits in common. "stable" and "0.4.2Beta1BugFix-REAL" have entirely different histories.

9 changed files with 162 additions and 288 deletions

@ -12,11 +12,11 @@ Get notifications from your feed reader when your favourite thread is about to d
4) Profit! RSS feed will include only the threads matching your query so every notification your feed reader will send means your 4) Profit! RSS feed will include only the threads matching your query so every notification your feed reader will send means your
watched thread is about to die watched thread is about to die
*NOTE THAT THIS IS A BETA RELEASE, IF YOUR THREAD DIES BECAUSE OF RSS-WATCHER MALFUNCTION DO NOT BLAME ME* pls *NOTE THAT THIS IS AN ALPHA RELEASE, IF YOUR THREAD DIES BECAUSE OF RSS-WATCHER MALFUNCTION DO NOT BLAME ME* pls
** Getting custom URL ** Getting custom URL
URL without any params (just ~/feed~) won't work. You must specify at least one ~q~ or ~Q~. See below. URL without any params (just ~/feed.xml~) won't work. You must specify at least one ~q~. See below.
*** Crafting URL by hand *** Crafting URL by hand
@ -24,14 +24,13 @@ Right now there is no automated way to generate your feed url but making one by
**** URL parameters **** URL parameters
Please note that default values may vary depending on which instance/host/board you use, these are the defaults that come with Please note that default values may vary depending on which host you use, these are the defaults that come with this software but
this software but anyone running instance of RSS thread watcher can change them anyone running instance of RSS thread watcher can change them
| Param name | Values [default] | Can have multiple? | Mandatory? | Short description | | Param name | Values [default] | Can have multiple? | Mandatory? | Short description |
|------------+-------------------------+--------------------+-------------------------+--------------------------------------------------------------------------------------------------| |------------+-------------------------+--------------------+-------------------------+--------------------------------------------------------------------------------------------------|
| board | "mlp" | No | No | Which board to generate feed for, only boards enabled by host will work | | board | "mlp" | No | No | Which board to generate feed for, only boards enabled by host will work |
| q | nil | Yes | Yes (1 or more) | This string is used to filter threads according to their titles, *REGEX NOT supported* yet | | q | nil | Yes | Yes (1 or more) | This string is used to filter threads according to their titles, *REGEX NOT supported* yet |
| Q | nil | Yes | No if ~q~ is present | This string is used to filter threads according to their titles, but is CaseSensitive |
| chod | 60-99 [94] | No | No | CHanceOfDeath - will include thread in the feed if its chance of death is > chod | | chod | 60-99 [94] | No | No | CHanceOfDeath - will include thread in the feed if its chance of death is > chod |
| repeat | true, paranoid, [false] | No | No (partly implemented) | Whether to make new notification on every server update even when thread doesn't have higher chod | | repeat | true, paranoid, [false] | No | No (partly implemented) | Whether to make new notification on every server update even when thread doesn't have higher chod |
| recreate | ~bool~ | Not implemented | Not implemented | Whether to notify when creation of new thread matching query is detected (uses 4chan's RSS) | | recreate | ~bool~ | Not implemented | Not implemented | Whether to notify when creation of new thread matching query is detected (uses 4chan's RSS) |
@ -41,14 +40,14 @@ this software but anyone running instance of RSS thread watcher can change them
Standard URL rules apply; if you know how to pass params in a URL to any website, you don't even have to read this Standard URL rules apply; if you know how to pass params in a URL to any website, you don't even have to read this
- Open some text editor - Open some text editor
- Paste in default URL: ~https://tools.treebrary.org/thread-watcher/feed?~ (you can use plain HTTP if you want to) - Paste in default URL: ~https://tools.treebrary.org/thread-watcher/feed.xml?~ (you can use plain HTTP if you want to)
- Now you can append any of the supported parameters (which you can find in the above table): - Now you can append any of the supported parameters (which you can find in the above table):
- For example if we want to be informed about threads with "cute" in their title - For example if we want to be informed about threads with "cute" in their title
- ~q=cute~ which would make ~https://tools.treebrary.org/thread-watcher/feed?q=cute~ - ~q=cute~ which would make ~https://tools.treebrary.org/thread-watcher/feed.xml?q=cute~
- If you want more than one param, separate with ~&~, for example: - If you want more than one param, separate with ~&~, for example:
- ~q=cute~ and ~q=pretty~ would be ~https://tools.treebrary.org/thread-watcher/feed?q=cute&q=pretty~ - ~q=cute~ and ~q=pretty~ would be ~https://tools.treebrary.org/thread-watcher/feed.xml?q=cute&q=pretty~
- Same is true for when you also want to specify ChoD - Same is true for when you also want to specify ChoD
- ~https://tools.treebrary.org/thread-watcher/feed?q=cute&q=pretty&chod=98~ - ~https://tools.treebrary.org/thread-watcher/feed.xml?q=cute&q=pretty&chod=98~
- This will only notify you about threads that: - This will only notify you about threads that:
- Have ~cute~ or ~pretty~ in their title - Have ~cute~ or ~pretty~ in their title
- Are at least 98% of the way down the catalog (at position ~147/150, i.e. 3 threads before being bumped off) - Are at least 98% of the way down the catalog (at position ~147/150, i.e. 3 threads before being bumped off)
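The hand-crafting steps above can also be sketched in Clojure. This is only an illustration, not part of rss-thread-watcher: ~feed-url~ is a hypothetical helper, and the base URL is passed in as an argument because the two sides of this diff disagree on whether the feed is served at ~/feed~ or ~/feed.xml~.

```clojure
(require '[clojure.string :as s])
(import 'java.net.URLEncoder)

(defn feed-url
  "Hypothetical helper: appends one q= pair per query and an optional
  chod= pair to base, separating the pairs with & as described above."
  [base queries chod]
  (let [encode (fn [v] (URLEncoder/encode (str v) "UTF-8"))
        pairs  (concat (map #(str "q=" (encode %)) queries)
                       ;; chod is optional; nil means "use the host's default"
                       (when chod [(str "chod=" chod)]))]
    (str base "?" (s/join "&" pairs))))

(feed-url "https://tools.treebrary.org/thread-watcher/feed" ["cute" "pretty"] 98)
;; => "https://tools.treebrary.org/thread-watcher/feed?q=cute&q=pretty&chod=98"
```

Escaping each query keeps characters such as ~&~ from being read as a parameter separator.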
@ -81,8 +80,6 @@ See [[https://git.treebrary.org/Treebrary.org/rss-thread-watcher/issues?q=&type=
- [ ] Support async responses - [ ] Support async responses
- [ ] Graal VM support for native compilation - [ ] Graal VM support for native compilation
For more up to date and complete list of features, check [[https://git.treebrary.org/Treebrary.org/rss-thread-watcher/projects][open projects]].
** Self hosting ** Self hosting
As of the first Beta release, self hosting is supported; please refer to [[file:res/ExampleConfig-documented.edn][documented example config]] for information on configuration As of the first Beta release, self hosting is supported; please refer to [[file:res/ExampleConfig-documented.edn][documented example config]] for information on configuration

@ -1,4 +1,4 @@
(defproject rss-thread-watch "0.4.9-SNAPSHOT" (defproject rss-thread-watch "0.4.2-SNAPSHOT"
:description "RSS based thread watcher" :description "RSS based thread watcher"
:url "http://example.com/FIXME" :url "http://example.com/FIXME"
:license {:name "AGPL-3.0-only" :license {:name "AGPL-3.0-only"

@ -3,11 +3,7 @@
;; Message displayed when requested board is not enabled ;; Message displayed when requested board is not enabled
:board-disabled-message "This board is not enabled for feed generation.\n\nYou can contact me here: [contact]" :board-disabled-message "This board is not enabled for feed generation.\n\nYou can contact me here: [contact]"
;; :enable-board-listing true ;Whether to show list of enabled boards in /boards UNIMPLEMENTED ;; :enable-board-listing true ;Whether to show list of enabled boards in /boards UNIMPLEMENTED
;; The watcher feed will be served by this url, everything else will be 404
:served-filename "/feed"
;; This is homepage for your feed, it should probably redirect somewhere where you mention
;; What things you have enabled and where to find full docs
:homepage "https://git.treebrary.org/Treebrary.org/rss-thread-watcher"
;; This map defines default values for all enabled boards, if you wish for some board ;; This map defines default values for all enabled boards, if you wish for some board
;; to use different values, specify them below in :boards-enabled ;; to use different values, specify them below in :boards-enabled
:boards-defaults { :boards-defaults {
@ -18,12 +14,14 @@
:starting-page 7 :starting-page 7
;; Default ChOD to use if none is specified by the user ;; Default ChOD to use if none is specified by the user
:default-chod 94 :default-chod 94
;; This is target for Catalog API requests ;; If you want to do some preprocessing beforehand, you can override
;; {board} will be substituted with the board name ;; target URL for the board, but the response must be the same as the 4chan API would return
:target "https://api.4chan.org/{board}/catalog.json" ;; /$board/catalog.json will be appended to this link
;; This is where threads actually reside if different from :target ;; This is target for API requests
;; you can use {board} and {threadnum} for substitutions :target "https://api.4chan.org"
:host "https://boards.4chan.org/{board}/thread/{threadnum}" ;; This is the host that has the actual threads, /board/thread-no will be appended
;; to this
:host "https://boards.4chan.org"
;; Commented parts below are still unimplemented ;; Commented parts below are still unimplemented
;; ------ ;; ------
;; Only download catalog when someone requests feed and cache is old ;; Only download catalog when someone requests feed and cache is old
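The ~{board}~ and ~{threadnum}~ placeholders described in the comments above can be expanded with plain string replacement. A minimal sketch, assuming a hypothetical ~expand-template~ helper; in this diff the project's own expansion is done by ~config-url-expand~ and ~thread-to-rss-item~, and the thread number used here is made up.

```clojure
(require '[clojure.string :as s])

(defn expand-template
  "Hypothetical helper: fills {board} and {threadnum} in template.
  board is given in the config's \"/mlp/\" form, so its slashes are stripped first."
  [template board threadnum]
  (-> template
      (s/replace "{board}" (s/replace board "/" ""))
      (s/replace "{threadnum}" (str threadnum))))

(expand-template "https://boards.4chan.org/{board}/thread/{threadnum}" "/mlp/" 12345)
;; => "https://boards.4chan.org/mlp/thread/12345"
```

A template without ~{threadnum}~, such as the catalog ~:target~, passes through the second replacement unchanged.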

@ -1,108 +0,0 @@
;; Copyright (C) 2024 Felisp
;;
;; This program is free software: you can redistribute it and/or modify
;; it under the terms of the GNU Affero General Public License as published by
;; the Free Software Foundation, version 3 of the License.
;;
;; This program is distributed in the hope that it will be useful,
;; but WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
;; GNU Affero General Public License for more details.
;;
;; You should have received a copy of the GNU Affero General Public License
;; along with this program. If not, see <https://www.gnu.org/licenses/>.
(ns rss-thread-watch.config
"Functions for working with configuration"
(:require [clojure.java.io :as io]
[clojure.edn :as edn]
[clojure.string :as s]
[rss-thread-watch.utils :as u])
(:gen-class))
;; Verification TODO: check if all required keys are included so we don't get nils
(def VERSION "0.4.9")
(def GLOBAL-CONFIG
"Global config with defaults for missing entries"
;; I know globals are ew in Clojure but I don't know any
;; better way of doing this
(atom nil))
;; Internal default config
(def CONFIG-DEFAULT
"Internal default config"
{:port 6969
:default-board "/mlp/"
:enable-board-listing true
:served-filename "/feed"
:homepage "https://git.treebrary.org/Treebrary.org/rss-thread-watcher"
:board-disabled-message "This board is not enabled for feed generation.\n\nYou can contact me here: [contact] and I may enable it for you"
:boards-defaults {:refresh-rate 300
:starting-page 7
:default-chod 94
:target "https://api.4chan.org/{board}/catalog.json"
:host "https://boards.4chan.org/{board}/thread/{threadnum}"
:lazy-load true}
:boards-enabled {"/mlp/" {:lazy-load false}
"/g/" {:starting-page 7}
"/po/" {:starting-page 8
:refresh-rate 86400}
"/p/" {:starting-page 8
:refresh-rate 1800}}})
(defn load-config
"Attempts to load config from file [f].
Returns loaded config map or nil if failed"
[f]
(let [fl (io/as-file f)]
(when (.exists fl)
(with-open [r (io/reader fl)]
(edn/read (java.io.PushbackReader. r))))))
(defn config-url-expand
"Expands substitution in :target and :host fields"
[filled-config]
(let [boards (get filled-config :boards-enabled)
selecting '(:target :host)
pattern "{board}"]
(assoc filled-config
:boards-enabled
(u/fmap (fn [board confs]
(->> (select-keys confs selecting)
(u/fmap (fn [_ v]
(s/replace v pattern (s/replace board "/" ""))))
(merge confs)))
boards))))
(defn config-fill-board-defaults
;; TODO: must have check that if board is default, it's enabled, if it's not, give some big fat warning
;; that users must always specify board, maybe change the error?
"Fills every enabled board with default config values"
[config]
(let [defaults (:boards-defaults config)]
(as-> config conf
(update-in conf
'(:boards-enabled)
(fn [mp]
(u/fmap (fn [k v]
(assoc (u/map-apply-defaults v defaults) :name k))
mp)))
(dissoc conf :boards-defaults)
(config-url-expand conf))))
(defn get-some-config
"Attempts to get config somehow,
first from [custom-file], if it's nil,
then from ./config.edn file.
If neither exists, default internal one is used."
[custom-file]
(config-fill-board-defaults
;; TODO: There has to be try/catch for when file is invalid edn
;; This is gonna be done when config validation comes in Beta 2
(let [file-to-try (u/nil?-else custom-file
"./config.edn")]
(u/when-else (load-config file-to-try)
CONFIG-DEFAULT))))

@ -13,15 +13,36 @@
;; along with this program. If not, see <https://www.gnu.org/licenses/>. ;; along with this program. If not, see <https://www.gnu.org/licenses/>.
(ns rss-thread-watch.core (ns rss-thread-watch.core
(:require [clojure.tools.cli :refer [parse-opts]] (:require [clojure.java.io :as io]
[clojure.edn :as edn]
[clojure.tools.cli :refer [parse-opts]]
[ring.adapter.jetty :as jetty] [ring.adapter.jetty :as jetty]
[ring.middleware.params :as rp] [ring.middleware.params :as rp]
[rss-thread-watch.watcher :as watcher] [rss-thread-watch.watcher :as watcher]
[rss-thread-watch.feed-generator :as feed] [rss-thread-watch.feed-generator :as feed]
[rss-thread-watch.utils :as u] [rss-thread-watch.utils :as u])
[rss-thread-watch.config :as conf])
(:gen-class)) (:gen-class))
(def VERSION "0.4.2")
;; Internal default config
(def CONFIG-DEFAULT
"Internal default config"
{:port 6969
:default-board "/mlp/"
:enable-board-listing true
:board-disabled-message "This board is not enabled for feed generation.\n\nYou can contact me here: [contact] and I may enable it for you"
:boards-defaults {:refresh-rate 300
:starting-page 7
:default-chod 94
:target "https://api.4chan.org"
:lazy-load true}
:boards-enabled {"/mlp/" {:lazy-load false}
"/g/" {:starting-page 7}
"/po/" {:starting-page 8
:refresh-rate 86400}
"/p/" {:starting-page 8
:refresh-rate 1800}}})
(def cli-options (def cli-options
"Configuration defining program arguments for cli.tools" "Configuration defining program arguments for cli.tools"
@ -45,6 +66,41 @@
(println "Error while updating cache: " e ", retrying in " (/ ms 1000 60) " minutes")))) (println "Error while updating cache: " e ", retrying in " (/ ms 1000 60) " minutes"))))
(Thread/sleep ms))))) (Thread/sleep ms)))))
(defn load-config
"Attempts to load config from file [f].
Returns loaded config map or nil if failed"
[f]
(let [fl (io/as-file f)]
(when (.exists fl)
(with-open [r (io/reader fl)]
(edn/read (java.io.PushbackReader. r))))))
(defn config-fill-board-defaults
"Fills every enabled board with default config values"
[config]
(let [defaults (:boards-defaults config)]
(dissoc (update-in config
'(:boards-enabled)
(fn [mp]
(u/fmap (fn [k v]
(assoc (u/map-apply-defaults v defaults) :name k))
mp)))
:boards-defaults)))
(defn get-some-config
"Attempts to get config somehow,
first from [custom-file], if it's nil,
then from ./config.edn file.
If neither exists, default internal one is used."
[custom-file]
(config-fill-board-defaults
;; TODO: There has to be try/catch for when file is invalid edn
;; This is gonna be done when config validation comes in Beta 2
(let [file-to-try (u/nil?-else custom-file
"./config.edn")]
(u/when-else (load-config file-to-try)
CONFIG-DEFAULT))))
(defn -main (defn -main
"Entry point, starts webserver" "Entry point, starts webserver"
[& args] [& args]
@ -54,41 +110,36 @@
(println "Error: " err) (println "Error: " err)
(System/exit 1)) (System/exit 1))
(when (get options :version) (when (get options :version)
(println "RSS Thread Watcher " conf/VERSION " Licensed under AGPL-3.0-only") (println "RSS Thread Watcher " VERSION " Licensed under AGPL-3.0-only")
(System/exit 0)) (System/exit 0))
(when (get options :help) (when (get options :help)
(println "RSS Thread Watcher help:\n" (get parsed-args :summary)) (println "RSS Thread Watcher help:\n" (get parsed-args :summary))
(System/exit 0)) (System/exit 0))
(when (get options :print-default-config) (when (get options :print-default-config)
(println ";;Default internal config file from RSS Thread Watcher " conf/VERSION) (println ";;Default internal config file from RSS Thread Watcher " VERSION)
(clojure.pprint/pprint conf/CONFIG-DEFAULT) (clojure.pprint/pprint CONFIG-DEFAULT)
;; In case someone was copying by hand, this might be useful ;; In case someone was copying by hand, this might be useful
(println ";;END of Default internal config file") (println ";;END of Default internal config file")
(System/exit 0)) (System/exit 0))
(let [config (conf/get-some-config (:config options))] (let [config (get-some-config (:config options))]
;; TODO: probably refactor to use separate config.clj file when validation will be added ;; TODO: probably refactor to use separate config.clj file when validation will be added
;; Init the few globals we have ;; Init the few globals we have
;; TODO: this all needs to go in separate function so it doesn't have to be duplicated in repl-main (reset! watcher/GLOBAL-CONFIG config)
(reset! conf/GLOBAL-CONFIG config)
(reset! feed/boards-enabled-cache (set (keys (get config :boards-enabled)))) (reset! feed/boards-enabled-cache (set (keys (get config :boards-enabled))))
(reset! watcher/chod-threads-cache (watcher/generate-chod-cache-structure config)) (reset! watcher/chod-threads-cache (watcher/generate-chod-cache-structure config))
(clojure.pprint/pprint config) (clojure.pprint/pprint config)
(jetty/run-jetty (rp/wrap-params feed/http-handler) {:port (:port config) (jetty/run-jetty (rp/wrap-params feed/http-handler) {:port (:port CONFIG-DEFAULT)
:join? true})))) :join? true}))))
;; Docs: https://github.com/ring-clojure/ring/wiki/Getting-Started ;; Docs: https://github.com/ring-clojure/ring/wiki/Getting-Started
(defn repl-main (defn repl-main
"Development entry point" "Development entry point"
[] []
(let [config (conf/get-some-config nil)]
;; TODO: probably refactor to use separate config.clj file when validation will be added
;; Init the few globals we have
(reset! conf/GLOBAL-CONFIG config)
(reset! feed/boards-enabled-cache (set (keys (get config :boards-enabled))))
(reset! watcher/chod-threads-cache (watcher/generate-chod-cache-structure config)))
(jetty/run-jetty (rp/wrap-params #'feed/http-handler) (jetty/run-jetty (rp/wrap-params #'feed/http-handler)
{:port (:port conf/CONFIG-DEFAULT) {:port (:port CONFIG-DEFAULT)
;; Dont block REPL thread ;; Dont block REPL thread
:join? false})) :join? false}))
;; (repl-main) ;; (repl-main)
;; Single cache update for repl
;; (watcher/update-thread-cache! (:target CONFIG) (:starting-page CONFIG))

@ -14,13 +14,12 @@
(ns rss-thread-watch.feed-generator (ns rss-thread-watch.feed-generator
"Generates feeds for requests" "Generates feeds for requests"
(:require [ring.util.response :as response] (:require [ring.middleware.params :as rp]
[ring.util.response :as response]
[clj-rss.core :as rss] [clj-rss.core :as rss]
[clojure.string :as s] [clojure.string :as s]
[rss-thread-watch.watcher :as watcher] [rss-thread-watch.watcher :as watcher]
[rss-thread-watch.utils :as ut] [rss-thread-watch.utils :as ut])
[rss-thread-watch.config :as conf]
[rss-thread-watch.filters :as f])
(:gen-class)) (:gen-class))
(def boards-enabled-cache (def boards-enabled-cache
@ -34,8 +33,8 @@
This is done by always making new GUID - (concat thread-number UNIX-time-of-data-update)" This is done by always making new GUID - (concat thread-number UNIX-time-of-data-update)"
[thread time] [thread time]
(assoc thread :guid (str (:no thread) (assoc thread :guid (str (:no thread)
"-" "-"
time))) time)))
(defn new-guid-paranoid (defn new-guid-paranoid
"Generate unique GUID on EVERY request to the feed. "Generate unique GUID on EVERY request to the feed.
@ -51,24 +50,16 @@
This is done by concating thread-number and it's rounded chod" This is done by concating thread-number and it's rounded chod"
[thread] [thread]
(assoc thread :guid (format "%d-%d-%.2f" (assoc thread :guid (format "%d-%.2f"
(:no thread) (:no thread)
(:last-modified thread)
(:chod thread)))) (:chod thread))))
(defn make-filters
"Creates map of functions and filters from query string.
Return format is: {filter-fun ['words' 'to' 'filter' 'using this function]}"
[query-string known-filter-map]
(let [filterable (select-keys query-string
(keys known-filter-map))]
(ut/fkmap (fn [k v]
{(get known-filter-map k) (ut/vectorize v)})
filterable)))
(defn filter-chod-posts (defn filter-chod-posts
"Return list of all threads with equal or higher ChoD than requested" "Return list of all threads with equal or higher ChoD than requested
[filters chod-treshold repeat? board-cache]
READS FROM GLOBALS: watcher.time-of-cache"
[query-vec chod-treshold repeat? board-cache]
(let [{time-of-generation :time (let [{time-of-generation :time
cache :data} board-cache cache :data} board-cache
guid-fn (case repeat? guid-fn (case repeat?
@ -76,104 +67,98 @@
"true" (fn [x] (new-guid-always x time-of-generation)) "true" (fn [x] (new-guid-always x time-of-generation))
update-only-guid) update-only-guid)
cache-start-index (first (ut/indices (fn [x] (>= (:chod x) chod-treshold)) cache-start-index (first (ut/indices (fn [x] (>= (:chod x) chod-treshold))
cache)) cache))
;; So we don't have to search thru everything we have cached ;; So we don't have to search thru everything we have cached
needed-cache-part (subvec cache cache-start-index) needed-cache-part (subvec cache cache-start-index)
actuall-matches (keep (fn [thread] actuall-matches (keep (fn [t]
(some (let [title (:title t)]
(fn [fun] ;; Todo: Man, wouldn't it be cool to know which querry matched the thread?
(when (fun thread (get filters fun)) ;; Would be so much easier for user to figure out why is it showing
thread)) ;; and it would solve the problem of super long titles (or OPs instead of titles)
(keys filters))) (when (some (fn [querry]
(s/includes? (s/lower-case title) (s/lower-case querry)))
query-vec)
t)))
(reverse needed-cache-part))] (reverse needed-cache-part))]
;; Finally generate and append GUIDs ;; Finally generate and append GUIDs
(map guid-fn actuall-matches))) (map guid-fn actuall-matches)))
(defn thread-to-rss-item (defn thread-to-rss-item
"Converts cached thread item to feed item which can be serialized into RSS" "Converts cached thread item to feed item which can be serialized into RSS"
[t host] [t host board]
(let [link-url (s/replace host "{threadnum}" (str (:no t)))] (let [link-url (str host board (:no t))]
{:title (format "%.2f%% - %s" (:chod t) (:title t)) {:title (format "%.2f%% - %s" (:chod t) (:title t)) ;TODO: Generate link from the target somehow, or just include it from API response
;; :url link-url <- this is supposed to be for images according to: https://cyber.harvard.edu/rss/rss.html ;; :url link-url <- this is supposed to be for images according to: https://cyber.harvard.edu/rss/rss.html
:description (format "The thread: '%s' has %.2f%% chance of dying" (:title t) (:chod t)) :description (format "The thread: '%s' has %.2f%% chance of dying" (:title t) (:chod t))
:link link-url :link link-url
:guid (:guid t)})) :guid (:guid t)}))
(defn generate-feed (defn generate-feed
"Generates feed from matching items "Generates feed from matching items"
[query-vec chod-treshold repeat? cache board-config]
READS FROM GLOBALS: (let [items (filter-chod-posts query-vec chod-treshold repeat? cache)
rss-thread-watch.config/VERSION head {:title "RSS Thread watcher v0.4.2" ;TODO: hardcoded string here, remake to reference to config.clj
rss-thread-watch.config/GLOBAL-CONFIG" :link "https://tools.treebrary.org/thread-watcher/feed.xml"
[filters chod-treshold repeat? cache board-config self-link] :feed-url "https://tools.treebrary.org/thread-watcher/feed.xml"
(let [items (filter-chod-posts filters chod-treshold repeat? cache)
head {:title (str "RSS Thread watcher v" conf/VERSION)
;; :link is the homepage of the channel
:link (get @conf/GLOBAL-CONFIG :homepage)
;; :feed-url is where you can get new items, must match the url this is served at
:feed-url self-link
:description "RSS based thread watcher"} :description "RSS based thread watcher"}
body (map #(thread-to-rss-item body (map #(thread-to-rss-item
%1 %1
(get board-config :host)) items)] (get board-config :host)
(get board-config :name)) items)]
(rss/channel-xml head body))) (rss/channel-xml head body)))
(defn http-handler (defn http-handler
"Handles HTTP requests, returns generated feed "Handles HTTP requests, returns generated feed
READS FROM GLOBALS: READS FROM GLOBALS:
rss-thread-watch.watcher/chod-threads-cache rss-thread-watch.watcher.chod-threads-cache
rss-thread-watch.config/GLOBAL-CONFIG" rss-thread-watch.watcher.GLOBAL-CONFIG" ;TODO: Update if it really reads from there anymore
[rqst] [rqst]
(try (let [served-filename (get @conf/GLOBAL-CONFIG :served-filename) (try (let [{{chod "chod"
{{chod "chod"
board "board" board "board"
repeat? "repeat" :or {chod "94" repeat? "repeat" :or {chod "94"
board (get @conf/GLOBAL-CONFIG :default-board) board (get @watcher/GLOBAL-CONFIG :default-board)
repeat? false} repeat? false}
:as prms} :params :as prms} :params
uri :uri uri :uri} rqst
query :query-string qrs (prms "q")
scheme :scheme queries (if (vector? qrs) qrs [qrs]) ; to always return vector
server-name :server-name} rqst real-chod (if-let [ch (or (and (vector? chod)
filters (make-filters prms f/known-filters) (first chod))
;; BUG if local fileserver not running -> FileNotFound exception is thrown and it fucks up the feed generation chod)]
;; Should be handled because wrong config and thus url generation could do the same (try ;If we can't parse number from chod, use default 94
self-uri (str (s/replace-first scheme ":" "") ; (if (or (vector? chod)
"://" server-name uri "?" query) (<= (Integer/parseInt chod) 60)) ; Never accept chod lower than 60 TODO: don't hardcode this
board-config (get-in @conf/GLOBAL-CONFIG [:boards-enabled board]) 60 (Integer/parseInt chod))
real-chod (try (max (Integer/parseInt (or (and (vector? chod) (catch Exception e
(first chod)) 94)))
chod)) 60) ;HARDCODED CHoD board-config (get-in @watcher/GLOBAL-CONFIG [:boards-enabled board])
(catch Exception _
(get board-config :default-chod)))
cache @watcher/chod-threads-cache] cache @watcher/chod-threads-cache]
(println "\n\nRCVD: " rqst) (println "\n\nRCVD: " rqst)
;; (println rqst) ;; (println rqst)
;; ====== Errors ===== ;; ====== Errors =====
;; Something other than $served-filename requested ;; Something other than feed.xml requested
(when-not (s/ends-with? uri served-filename) (when-not (s/ends-with? uri "feed.xml")
(throw (ex-info "404" {:status 404 (throw (ex-info "404" {:status 404
:header {"Content-Type" "text/plain"} :header {"Content-Type" "text/plain"}
:body (str "404 This server has nothing but " served-filename)}))) :body "404 This server has nothing but /feed.xml"})))
(when-not (contains? @boards-enabled-cache board) (when-not (contains? @boards-enabled-cache board)
(throw (ex-info "403" {:status 403 (throw (ex-info "403" {:status 403
:header {"Content-Type" "text/plain"} :header {"Content-Type" "text/plain"}
:body (get @conf/GLOBAL-CONFIG :board-disabled-message)}))) :body (get @watcher/GLOBAL-CONFIG :board-disabled-message)})))
;; No url params -> we redirect to documentation about params ;; No url params -> we redirect to documentation about params
(when (empty? prms) (when (empty? prms)
(throw (ex-info "302" (throw (ex-info "302"
(response/redirect (get @conf/GLOBAL-CONFIG :homepage))))) (response/redirect "https://git.treebrary.org/Treebrary.org/rss-thread-watcher#headline-4"))))
;; No querry specified - don't know what to search for ;; No querry specified - don't know what to search for
(when-not (some f/known-filter-set (keys prms)) (when-not (prms "q")
(throw (ex-info "400" {:status 400 (throw (ex-info "400" {:status 400
:header {"Content-Type" "text/plain"} :header {"Content-Type" "text/plain"}
:body (str "400 You MUST specify query with one OR more 'q=searchTerm' (or 'Q=SeARChteRm' for case sensitive) url parameter(s)\n\n\n" :body (str "400 You MUST specify query with one OR more 'q=searchTerm' url parameter(s)\n\n\n"
"Example: '" served-filename "?q=pony&q=IWTCIRD' will show in your feed all threads with 'pony' or 'IWTCIRD'" "Example: '/feed.xml?q=pony&q=IWTCIRD' will show in your feed all threads with 'pony' or 'IWTCIRD'"
" in their title that are about to die.")}))) " in their title that are about to die.")})))
;; Whether cache has been generated yet ;; Whether cache has been generated yet
(when (empty? cache) (when (empty? cache)
(throw (ex-info "503" {:status 503 (throw (ex-info "503" {:status 503
:header {"Content-Type" "text/plain"} :header {"Content-Type" "text/plain"}
@ -184,7 +169,7 @@
;; There shouldn't be any problems with this mime type but if there are ;; There shouldn't be any problems with this mime type but if there are
;; replace with "text/xml", or even better, get RSS reader that is not utter shit ;; replace with "text/xml", or even better, get RSS reader that is not utter shit
:header {"Content-Type" "application/rss+xml"} :header {"Content-Type" "application/rss+xml"}
:body (generate-feed filters real-chod repeat? (watcher/get-thread-data board @conf/GLOBAL-CONFIG) board-config self-uri)}) :body (generate-feed queries real-chod repeat? (watcher/get-thread-data board @watcher/GLOBAL-CONFIG) board-config)})
(catch Exception e (catch Exception e
;; Ex-info has been crafted to match HTTP response body so we can send it ;; Ex-info has been crafted to match HTTP response body so we can send it
(if-let [caught (ex-data e)] (if-let [caught (ex-data e)]

@ -1,37 +0,0 @@
;; Copyright (C) 2024 Felisp
;;
;; This program is free software: you can redistribute it and/or modify
;; it under the terms of the GNU Affero General Public License as published by
;; the Free Software Foundation, version 3 of the License.
;;
;; This program is distributed in the hope that it will be useful,
;; but WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
;; GNU Affero General Public License for more details.
;;
;; You should have received a copy of the GNU Affero General Public License
;; along with this program. If not, see <https://www.gnu.org/licenses/>.
(ns rss-thread-watch.filters
"Functions filtering posts"
(:require [clojure.string :as cs])
(:gen-class))
(defn case-sensitive-filter
"Returns true if the thread title contains any of [queries]. Case sensitive"
[{:keys [title]} queries]
(some (fn [querry]
(cs/includes? title querry))
queries))
(defn case-insensitive-filter
"Returns true if the thread title contains any of [queries], ignoring case"
[{:keys [title]} queries]
(case-sensitive-filter {:title (cs/lower-case title)} (map cs/lower-case queries)))
(def known-filters
{"Q" case-sensitive-filter
"q" case-insensitive-filter})
(def known-filter-set (set (keys known-filters)))

@ -47,11 +47,6 @@
~x ~x
result#))) result#)))
(defn vectorize
"If arg is not a vector, put into vector, otherwise return it"
[v]
(if (vector? v) v [v]))
;; ===== Generic functions ==== ;; ===== Generic functions ====
(defn indices (defn indices
@ -74,28 +69,15 @@
{k (map-apply-defaults conf-val default-val)} {k (map-apply-defaults conf-val default-val)}
{k (nil?-else conf-val default-val)}))))) {k (nil?-else conf-val default-val)})))))
;; This is a shitty version of reduce-kv
(defn fmap (defn fmap
"Applies function [f] to every key and value in map [m] "Applies function [f] to every key and value in map [m]
Function signature should be (f [key value]). Function signature should be (f [key value])."
Key stays unchanged"
[f m] [f m]
(into (into
(empty m) (empty m)
(for [[key val] m] (for [[key val] m]
[key (f key val)]))) [key (f key val)])))
(defn fkmap
;; I am horrible with docstrings, I don't deny that
"Applies function [f] to every key and value in map [m]
Function signature should be (f [key value]).
Unlike fmap, you can change key too, so return both {key value} in map"
[f m]
(into
(empty m)
(for [[key val] m]
(f key val))))
(defn expand-home (defn expand-home
"Expands ~ to home directory" "Expands ~ to home directory"
;;modified from sauce: https://stackoverflow.com/questions/29585928/how-to-substitute-path-to-home-for ;;modified from sauce: https://stackoverflow.com/questions/29585928/how-to-substitute-path-to-home-for

View file

@ -18,6 +18,12 @@
[clojure.data.json :as js]) [clojure.data.json :as js])
(:gen-class)) (:gen-class))
(def GLOBAL-CONFIG
"Global config with defaults for missing entries"
;; I know globals are ew in Clojure but I don't know any
;; better way of doing this
(atom nil))
(def chod-threads-cache (def chod-threads-cache
"Cached map of threads that have CHanceOfDeath > configured" "Cached map of threads that have CHanceOfDeath > configured"
(atom {})) (atom {}))
@ -32,7 +38,7 @@
(defn process-page (defn process-page
"Processes every thread in page, leaving only relevant information "Processes every thread in page, leaving only relevant information
(:title or :com, :no :chod :last_modified)" (title no chod)"
([threads-to-index threads-total starting-index] (process-page threads-to-index threads-total starting-index (transient []))) ([threads-to-index threads-total starting-index] (process-page threads-to-index threads-total starting-index (transient [])))
([remaining-threads threads-total index ret] ([remaining-threads threads-total index ret]
(if (empty? remaining-threads) (if (empty? remaining-threads)
@ -41,16 +47,17 @@
(recur (rest remaining-threads) (recur (rest remaining-threads)
threads-total threads-total
(inc index) (inc index)
;; We have to somehow include URL which is a problem since the catalog does not contain any
;; I of course know how to craft it but the result will be kind of 4chan specific
(conj! ret {:title (or (:sub thread) ;We use thread title if thread has it (conj! ret {:title (or (:sub thread) ;We use thread title if thread has it
(:com thread) ;we use body if thread has it (:com thread) ;we use body if thread has it
"") ;Thread has neither, this prevents null pointer "") ;Thread has neither, this prevents null pointer
:no (:no thread) :no (:no thread)
:chod (* 100 (float (/ index threads-total))) :chod (* 100 (float (/ index threads-total)))}))))))
:last-modified (:last_modified thread)}))))))
(defn build-cache (defn build-cache
"Build cache of near-death threads so the values don't have to be recalculated on each request." "Build cache of near-death threads so the values don't have to be recalculated on each request."
[pages-to-index threads-per-page threads-total] [pages-to-index pages-total threads-per-page threads-total]
{:time (System/currentTimeMillis) {:time (System/currentTimeMillis)
:data (vec (flatten (map (fn [single-page] :data (vec (flatten (map (fn [single-page]
;; We have to (dec page-number) bcs otherwise we would get the total number of threads ;; We have to (dec page-number) bcs otherwise we would get the total number of threads
@ -64,9 +71,8 @@
[url] - Url to download data from [url] - Url to download data from
[board] - Board to assign cached data to, its existence is NOT checked here [board] - Board to assign cached data to, its existence is NOT checked here
[starting-page] - From which page consider threads to be fit for near-death cache [starting-page] - From which page consider threads to be fit for near-death cache
Returns :data part of [board] cache THIS FUNCTION WRITES TO chod-threads-cache
THIS FUNCTION WRITES TO: Returns :data part of [board] cache"
rss-thread-watch.watcher/chod-threads-cache"
[url board starting-page] [url board starting-page]
;; Todo: surround with try so we can timeout, 40x and other stuff ;; Todo: surround with try so we can timeout, 40x and other stuff
(let [catalog (with-open [readr (io/reader url)] (let [catalog (with-open [readr (io/reader url)]
@ -83,7 +89,7 @@
;; This will return nill and that fuck everything up ;; This will return nill and that fuck everything up
(println "Refreshed cache for " board) (println "Refreshed cache for " board)
(reset! (get @chod-threads-cache board) (reset! (get @chod-threads-cache board)
(build-cache to-index threads-per-page threads-total)))) (build-cache to-index pages-total threads-per-page threads-total))))
(defn board-enabled? (defn board-enabled?
"Checks whether board is enabled in config" "Checks whether board is enabled in config"
@ -103,13 +109,13 @@
MAY CAUSE WRITE TO chod-threads-cache IF NECESSARY" MAY CAUSE WRITE TO chod-threads-cache IF NECESSARY"
[board config] [board config]
(let [refresh-rate (* 1000 (get-in config `(:boards-enabled ~board :refresh-rate))) (let [refresh-rate (* 1000 (get-in config `(:boards-enabled ~board :refresh-rate)))
board-catalog-url (get-in config `(:boards-enabled ~board :target)) {data :data
{time-downloaded :time time-downloaded :time
:or {time-downloaded 0} :or {time-downloaded 0}
:as board-atom } @(get @chod-threads-cache board) :as board-atom } @(get @chod-threads-cache board)
;; TODO: This also makes it implicitly lazy-load -> if disabled make the check here ;; TODO: This also makes it implicitly lazy-load -> if disabled make the check here
time-to-update? (or (nil? board-atom) time-to-update? (or (nil? board-atom)
(> (System/currentTimeMillis) (+ refresh-rate time-downloaded)))] (> (System/currentTimeMillis) (+ refresh-rate time-downloaded)))]
(if time-to-update? (if time-to-update?
(update-board-cache! board-catalog-url board (get-in config [:boards-enabled board :starting-page])) (update-board-cache! (get-board-url board config) board (get-in config [:boards-enabled board :starting-page]))
@(get @chod-threads-cache board)))) @(get @chod-threads-cache board))))