Compare commits

No commits in common. "stable" and "dev" have entirely different histories.
stable ... dev

9 changed files with 164 additions and 317 deletions

View file

@ -12,11 +12,11 @@ Get notifications from your feed reader when your favourite thread is about to d
4) Profit! The RSS feed will include only the threads matching your query, so every notification your feed reader sends means your
watched thread is about to die
*NOTE THAT THIS IS A BETA RELEASE, IF YOUR THREAD DIES BECAUSE OF RSS-WATCHER MALFUNCTION DO NOT BLAME ME* pls
*NOTE THAT THIS IS AN ALPHA RELEASE, IF YOUR THREAD DIES BECAUSE OF RSS-WATCHER MALFUNCTION DO NOT BLAME ME* pls
** Getting custom URL
A URL without any params (just ~/feed~) won't work. You must specify at least one ~q~ or ~Q~. See below.
A URL without any params (just ~/feed.xml~) won't work. You must specify at least one ~q~. See below.
*** Crafting URL by hand
@ -24,14 +24,13 @@ Right now there is no automated way to generate your feed url but making one by
**** URL parameters
Please note that default values may vary depending on which instance/host/board you use. These are the defaults that come with
this software, but anyone running an instance of RSS thread watcher can change them
Please note that default values may vary depending on which host you use. These are the defaults that come with this software, but
anyone running an instance of RSS thread watcher can change them
| Param name | Values [default] | Can have multiple? | Mandatory? | Short description |
|------------+-------------------------+--------------------+-------------------------+--------------------------------------------------------------------------------------------------|
| board | "mlp" | No | No | Which board to generate feed for, only boards enabled by host will work |
| q | nil | Yes | Yes (1 or more) | This string is used to filter threads according to their titles, *REGEX NOT supported* yet |
| Q | nil | Yes | No if ~q~ is present | This string is used to filter threads according to their titles, but is case-sensitive |
| chod | 60-99 [94] | No | No | CHanceOfDeath - will include thread in the feed if its chance of death is >= chod |
| repeat | true, paranoid, [false] | No | No (partly implemented) | Whether to make a new notification on every server update even when the thread doesn't have a higher chod |
| recreate | ~bool~ | Not implemented | Not implemented | Whether to notify when creation of a new thread matching the query is detected (uses 4chan's RSS) |
@ -41,14 +40,14 @@ this software but anyone running instance of RSS thread watcher can change them
Standard URL rules apply; if you know how to pass params in a URL to any website, you don't even have to read this
- Open some text editor
- Paste in default URL: ~https://tools.treebrary.org/thread-watcher/feed?~ (you can use plain HTTP if you want to)
- Paste in default URL: ~https://tools.treebrary.org/thread-watcher/feed.xml?~ (you can use plain HTTP if you want to)
- Now you can append any of the supported parameters (which you can find in the above table):
- For example if we want to be informed about threads with "cute" in their title
- ~q=cute~ which would make ~https://tools.treebrary.org/thread-watcher/feed?q=cute~
- ~q=cute~ which would make ~https://tools.treebrary.org/thread-watcher/feed.xml?q=cute~
- If you want more than one param, separate with ~&~, for example:
- ~q=cute~ and ~q=pretty~ would be ~https://tools.treebrary.org/thread-watcher/feed?q=cute&q=pretty~
- ~q=cute~ and ~q=pretty~ would be ~https://tools.treebrary.org/thread-watcher/feed.xml?q=cute&q=pretty~
- Same is true for when you also want to specify ChoD
- ~https://tools.treebrary.org/thread-watcher/feed?q=cute&q=pretty&chod=98~
- ~https://tools.treebrary.org/thread-watcher/feed.xml?q=cute&q=pretty&chod=98~
- This will only notify you about threads that:
- Have ~cute~ or ~pretty~ in their title
- Are at least 98% of the way down the catalog (around position 147/150, i.e. about 3 threads away from being bumped off)
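The hand-crafting steps above can also be sketched in code. This is only an illustration: ~feed-url~ is a hypothetical helper that does not exist in rss-thread-watch, and the base URL is passed in as an argument because the served filename differs between the two branches (~/feed~ vs ~/feed.xml~).

```clojure
(require '[clojure.string :as s])
(import '(java.net URLEncoder))

(defn feed-url
  "Builds a feed URL from `base` and an ordered seq of [name value] pairs.
  Hypothetical helper for illustration only, not part of rss-thread-watch."
  [base params]
  (str base "?"
       (s/join "&"
               (for [[k v] params]
                 ;; Percent-encode each value so spaces and symbols survive
                 (str k "=" (URLEncoder/encode (str v) "UTF-8"))))))

;; Repeating "q" yields one query parameter per search term
(println (feed-url "https://tools.treebrary.org/thread-watcher/feed"
                   [["q" "cute"] ["q" "pretty"] ["chod" 98]]))
```

Passing the parameters as a sequence of pairs rather than a map keeps repeated names such as ~q~ and preserves their order.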
@ -79,9 +78,7 @@ See [[https://git.treebrary.org/Treebrary.org/rss-thread-watcher/issues?q=&type=
- [ ] Support notification for thread death
- [X] Support multiple boards at once
- [ ] Support async responses
- [ ] Graal VM support for native compilation
For a more up-to-date and complete list of features, check [[https://git.treebrary.org/Treebrary.org/rss-thread-watcher/projects][open projects]].
- [ ] Graal VM support for native configuration
** Self hosting
@ -90,14 +87,8 @@ options.
*** Prebuilt
Download the newest release from [[https://git.treebrary.org/Treebrary.org/rss-thread-watcher/releases][releases]] and run it like you would any other Java executable (example below). There are two JARs:
the first requires you to have Clojure installed (~rss-thread-watch-[version]~), the second doesn't (the one with ~standalone~ in its
name).
~$ java -jar whatEverNameTheReleaseHas.jar~
Default port is ~6969~ so either change it or allow it in your firewall. HTTPS is not supported, so using some reverse proxy like
Nginx could be the easiest solution for that.
Download the newest release from [[https://git.treebrary.org/Treebrary.org/rss-thread-watcher/releases][releases]] and run it like you would any other Java executable. The default port is ~6969~
~$ java -jar whatEverNameTheReleaseHas.jar~
*** From source
Not officially supported. If you attempt this, please use the source from the release tarball or check out ~release~ or ~stable~

View file

@ -1,4 +1,4 @@
(defproject rss-thread-watch "0.4.9-SNAPSHOT"
(defproject rss-thread-watch "0.4.0-SNAPSHOT"
:description "RSS based thread watcher"
:url "http://example.com/FIXME"
:license {:name "AGPL-3.0-only"
@ -9,6 +9,6 @@
[clj-rss "0.4.0"]
[org.clojure/data.json "2.4.0"]
[org.clojure/tools.cli "1.1.230"]]
:main rss-thread-watch.core
:main ^:skip-aot rss-thread-watch.core
:target-path "target/%s"
:profiles {:uberjar {:aot :all}})

View file

@ -3,11 +3,7 @@
;; Message displayed when requested board is not enabled
:board-disabled-message "This board is not enabled for feed generation.\n\nYou can contact me here: [contact]"
;; :enable-board-listing true ;Whether to show list of enabled boards in /boards UNIMPLEMENTED
;; The watcher feed will be served by this url, everything else will be 404
:served-filename "/feed"
;; This is homepage for your feed, it should probably redirect somewhere where you mention
;; What things you have enabled and where to find full docs
:homepage "https://git.treebrary.org/Treebrary.org/rss-thread-watcher"
;; This map defines default values for all enabled boards, if you wish for some board
;; to use different values, specify them below in :boards-enabled
:boards-defaults {
@ -18,12 +14,10 @@
:starting-page 7
;; Default ChOD to use if none is specified by the user
:default-chod 94
;; This is target for Catalog API requests
;; {board} will be substituted for board
:target "https://api.4chan.org/{board}/catalog.json"
;; This is where threads actually reside if different from :target
;; you can use {board} and {threadnum} for substitutions
:host "https://boards.4chan.org/{board}/thread/{threadnum}"
;; If you want to do some preprocessing beforehand, you can override
;; Target URL for the board, but the response must be the same as what the 4chan API would return
;; /$board/catalog.json will be appended to this link
:target "https://api.4chan.org"
;; Commented parts below are still unimplemented
;; ------
;; Only download catalog when someone requests feed and cache is old

View file

@ -1,108 +0,0 @@
;; Copyright (C) 2024 Felisp
;;
;; This program is free software: you can redistribute it and/or modify
;; it under the terms of the GNU Affero General Public License as published by
;; the Free Software Foundation, version 3 of the License.
;;
;; This program is distributed in the hope that it will be useful,
;; but WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
;; GNU Affero General Public License for more details.
;;
;; You should have received a copy of the GNU Affero General Public License
;; along with this program. If not, see <https://www.gnu.org/licenses/>.
(ns rss-thread-watch.config
"Functions for working with configuration"
(:require [clojure.java.io :as io]
[clojure.edn :as edn]
[clojure.string :as s]
[rss-thread-watch.utils :as u])
(:gen-class))
;; Verification TODO: check if all required keys are included so we don't get nils
(def VERSION "0.4.9")
(def GLOBAL-CONFIG
"Global config with defaults for missing entries"
;; I know globals are ew in Clojure but I don't know any
;; better way of doing this
(atom nil))
;; Internal default config
(def CONFIG-DEFAULT
"Internal default config"
{:port 6969
:default-board "/mlp/"
:enable-board-listing true
:served-filename "/feed"
:homepage "https://git.treebrary.org/Treebrary.org/rss-thread-watcher"
:board-disabled-message "This board is not enabled for feed generation.\n\nYou can contact me here: [contact] and I may enable it for you"
:boards-defaults {:refresh-rate 300
:starting-page 7
:default-chod 94
:target "https://api.4chan.org/{board}/catalog.json"
:host "https://boards.4chan.org/{board}/thread/{threadnum}"
:lazy-load true}
:boards-enabled {"/mlp/" {:lazy-load false}
"/g/" {:starting-page 7}
"/po/" {:starting-page 8
:refresh-rate 86400}
"/p/" {:starting-page 8
:refresh-rate 1800}}})
(defn load-config
"Attempts to load config from file [f].
Returns loaded config map or nil if failed"
[f]
(let [fl (io/as-file f)]
(when (.exists fl)
(with-open [r (io/reader fl)]
(edn/read (java.io.PushbackReader. r))))))
(defn config-url-expand
"Expands substitution in :target and :host fields"
[filled-config]
(let [boards (get filled-config :boards-enabled)
selecting '(:target :host)
pattern "{board}"]
(assoc filled-config
:boards-enabled
(u/fmap (fn [board confs]
(->> (select-keys confs selecting)
(u/fmap (fn [_ v]
(s/replace v pattern (s/replace board "/" ""))))
(merge confs)))
boards))))
(defn config-fill-board-defaults
;; TODO: must have check that if board is default, it's enabled, if it's not, give some big fat warning
;; that users must always specify board, maybe change the error?
"Fills every enabled board with default config values"
[config]
(let [defaults (:boards-defaults config)]
(as-> config conf
(update-in conf
'(:boards-enabled)
(fn [mp]
(u/fmap (fn [k v]
(assoc (u/map-apply-defaults v defaults) :name k))
mp)))
(dissoc conf :boards-defaults)
(config-url-expand conf))))
(defn get-some-config
"Attempts to get config somehow,
first from [custom-file], if it's nil,
then from ./config.edn file.
If neither exists, default internal one is used."
[custom-file]
(config-fill-board-defaults
;; TODO: There has to be try/catch for when file is invalid edn
;; This is gonna be done when config validation comes in Beta 2
(let [file-to-try (u/nil?-else custom-file
"./config.edn")]
(u/when-else (load-config file-to-try)
CONFIG-DEFAULT))))
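As a rough illustration of what the removed ~config-fill-board-defaults~ and ~config-url-expand~ do together, the sketch below fills one board with defaults and substitutes ~{board}~. It is a simplified stand-in, not the project's code: ~fill-board~ is a hypothetical name, and a shallow ~merge~ is assumed in place of ~u/map-apply-defaults~, whose exact behaviour is not fully shown in this diff.

```clojure
(require '[clojure.string :as s])

(defn fill-board
  "Applies `defaults` under the board's own settings, then substitutes
  {board} in :target and :host. Hypothetical, simplified stand-in."
  [board-name board-conf defaults]
  (let [plain-name (s/replace board-name "/" "")           ; "/po/" -> "po"
        expand     (fn [v] (s/replace v "{board}" plain-name))]
    (-> (merge defaults board-conf)                        ; board values win
        (assoc :name board-name)
        (update :target expand)
        (update :host expand))))

(def defaults
  {:refresh-rate  300
   :starting-page 7
   :target "https://api.4chan.org/{board}/catalog.json"
   :host   "https://boards.4chan.org/{board}/thread/{threadnum}"})

(def po-board
  (fill-board "/po/" {:starting-page 8 :refresh-rate 86400} defaults))

(println (:target po-board))
```

Note that ~{threadnum}~ is deliberately left in ~:host~; in the stable branch it is substituted later, per thread, when feed items are built.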

View file

@ -13,15 +13,36 @@
;; along with this program. If not, see <https://www.gnu.org/licenses/>.
(ns rss-thread-watch.core
(:require [clojure.tools.cli :refer [parse-opts]]
(:require [clojure.java.io :as io]
[clojure.edn :as edn]
[clojure.tools.cli :refer [parse-opts]]
[ring.adapter.jetty :as jetty]
[ring.middleware.params :as rp]
[rss-thread-watch.watcher :as watcher]
[rss-thread-watch.feed-generator :as feed]
[rss-thread-watch.utils :as u]
[rss-thread-watch.config :as conf])
[rss-thread-watch.utils :as u])
(:gen-class))
(def VERSION "0.4.0")
;; Internal default config
(def CONFIG-DEFAULT
"Internal default config"
{:port 6969
:default-board "/mlp/"
:enable-board-listing true
:board-disabled-message "This board is not enabled for feed generation.\n\nYou can contact me here: [contact] and I may enable it for you"
:boards-defaults {:refresh-rate 300
:starting-page 7
:default-chod 94
:target "https://api.4chan.org"
:lazy-load true}
:boards-enabled {"/mlp/" {:lazy-load false}
"/g/" {:starting-page 7}
"/po/" {:starting-page 8
:refresh-rate 86400}
"/p/" {:starting-page 8
:refresh-rate 1800}}})
(def cli-options
"Configuration defining program arguments for cli.tools"
@ -45,6 +66,41 @@
(println "Error while updating cache: " e ", retrying in " (/ ms 1000 60) " minutes"))))
(Thread/sleep ms)))))
(defn load-config
"Attempts to load config from file [f].
Returns loaded config map or nil if failed"
[f]
(let [fl (io/as-file f)]
(when (.exists fl)
(with-open [r (io/reader fl)]
(edn/read (java.io.PushbackReader. r))))))
(defn config-fill-board-defaults
"Fills every enabled board with default config values"
[config]
(let [defaults (:boards-defaults config)]
(dissoc (update-in config
'(:boards-enabled)
(fn [mp]
(u/fmap (fn [k v]
(u/map-apply-defaults v defaults))
mp)))
:boards-defaults)))
(defn get-some-config
"Attempts to get config somehow,
first from [custom-file], if it's nil,
then from ./config.edn file.
If neither exists, default internal one is used."
[custom-file]
(config-fill-board-defaults
;; TODO: There has to be try/catch for when file is invalid edn
;; This is gonna be done when config validation comes in Beta 2
(let [file-to-try (u/nil?-else custom-file
"./config.edn")]
(u/when-else (load-config file-to-try)
CONFIG-DEFAULT))))
(defn -main
"Entry point, starts webserver"
[& args]
@ -54,41 +110,36 @@
(println "Error: " err)
(System/exit 1))
(when (get options :version)
(println "RSS Thread Watcher " conf/VERSION " Licensed under AGPL-3.0-only")
(println "RSS Thread Watcher " VERSION " Licensed under AGPL-3.0-only")
(System/exit 0))
(when (get options :help)
(println "RSS Thread Watcher help:\n" (get parsed-args :summary))
(System/exit 0))
(when (get options :print-default-config)
(println ";;Default internal config file from RSS Thread Watcher " conf/VERSION)
(clojure.pprint/pprint conf/CONFIG-DEFAULT)
(println ";;Default internal config file from RSS Thread Watcher " VERSION)
(clojure.pprint/pprint CONFIG-DEFAULT)
;; In case someone was copying by hand, this might be useful
(println ";;END of Default internal config file")
(System/exit 0))
(let [config (conf/get-some-config (:config options))]
(let [config (get-some-config (:config options))]
;; TODO: probably refactor to use separate config.clj file when validation will be added
;; Init the few globals we have
;; TODO: this all needs to go in separate function so it doesnt have to duplicated in repl-main
(reset! conf/GLOBAL-CONFIG config)
(reset! watcher/GLOBAL-CONFIG config)
(reset! feed/boards-enabled-cache (set (keys (get config :boards-enabled))))
(reset! watcher/chod-threads-cache (watcher/generate-chod-cache-structure config))
(clojure.pprint/pprint config)
(jetty/run-jetty (rp/wrap-params feed/http-handler) {:port (:port config)
(jetty/run-jetty (rp/wrap-params feed/http-handler) {:port (:port CONFIG-DEFAULT)
:join? true}))))
;; Docs: https://github.com/ring-clojure/ring/wiki/Getting-Started
(defn repl-main
"Development entry point"
[]
(let [config (conf/get-some-config nil)]
;; TODO: probably refactor to use separate config.clj file when validation will be added
;; Init the few globals we have
(reset! conf/GLOBAL-CONFIG config)
(reset! feed/boards-enabled-cache (set (keys (get config :boards-enabled))))
(reset! watcher/chod-threads-cache (watcher/generate-chod-cache-structure config)))
(jetty/run-jetty (rp/wrap-params #'feed/http-handler)
{:port (:port conf/CONFIG-DEFAULT)
{:port (:port CONFIG-DEFAULT)
;; Dont block REPL thread
:join? false}))
;; (repl-main)
;; Single cache update for repl
;; (watcher/update-thread-cache! (:target CONFIG) (:starting-page CONFIG))

View file

@ -14,13 +14,12 @@
(ns rss-thread-watch.feed-generator
"Generates feeds for requests"
(:require [ring.util.response :as response]
(:require [ring.middleware.params :as rp]
[ring.util.response :as response]
[clj-rss.core :as rss]
[clojure.string :as s]
[rss-thread-watch.watcher :as watcher]
[rss-thread-watch.utils :as ut]
[rss-thread-watch.config :as conf]
[rss-thread-watch.filters :as f])
[rss-thread-watch.utils :as ut])
(:gen-class))
(def boards-enabled-cache
@ -34,146 +33,117 @@
This is done by always making new GUID - (concat thread-number UNIX-time-of-data-update)"
[thread time]
(assoc thread :guid (str (:no thread)
"-"
time)))
(defn new-guid-paranoid
"Generate unique GUID on EVERY request to the feed.
Useful for when you really don't want thread to die.
By making new GUID on every request you can be sure that what you're
seeing in your feed reader isn't cached"
[thread]
(new-guid-always thread (System/currentTimeMillis)))
"-"
time)))
(defn update-only-guid
"Generates new GUID for feed item ONLY if the threads ChoD increased
This is done by concatenating thread-number and its rounded chod"
[thread]
(assoc thread :guid (format "%d-%d-%.2f"
(assoc thread :guid (format "%d-%.2f"
(:no thread)
(:last-modified thread)
(:chod thread))))
(defn make-filters
"Creates map of functions and filters from query string.
Return format is: {filter-fun ['words' 'to' 'filter' 'using this function]}"
[query-string known-filter-map]
(let [filterable (select-keys query-string
(keys known-filter-map))]
(ut/fkmap (fn [k v]
{(get known-filter-map k) (ut/vectorize v)})
filterable)))
(defn filter-chod-posts
"Return list of all threads with equal or higher ChoD than requested"
[filters chod-treshold repeat? board-cache]
"Return list of all threads with equal or higher ChoD than requested
READS FROM GLOBALS: watcher.time-of-cache"
[query-vec chod-treshold repeat? board-cache]
(let [{time-of-generation :time
cache :data} board-cache
guid-fn (case repeat?
"paranoid" new-guid-paranoid
"true" (fn [x] (new-guid-always x time-of-generation))
update-only-guid)
guid-fn (if repeat? (fn [x] (new-guid-always x time-of-generation))
update-only-guid)
cache-start-index (first (ut/indices (fn [x] (>= (:chod x) chod-treshold))
cache))
cache))
;; So we don't have to search thru everything we have cached
needed-cache-part (subvec cache cache-start-index)
actuall-matches (keep (fn [thread]
(some
(fn [fun]
(when (fun thread (get filters fun))
thread))
(keys filters)))
actuall-matches (keep (fn [t]
(let [title (:title t)]
;; Todo: Man, wouldn't it be cool to know which querry matched the thread?
;; Would be so much easier for user to figure out why is it showing
;; and it would solve the problem of super long titles (or OPs instead of titles)
(when (some (fn [querry]
(s/includes? (s/lower-case title) (s/lower-case querry)))
query-vec)
t)))
(reverse needed-cache-part))]
;; Finally generate and append GUIDs
(map guid-fn actuall-matches)))
(defn thread-to-rss-item
"Converts cached thread item to feed item which can be serialized into RSS"
[t host]
(let [link-url (s/replace host "{threadnum}" (str (:no t)))]
{:title (format "%.2f%% - %s" (:chod t) (:title t))
"If I wasnt retarded I could have made the cached version look like
rss item already but what can you do. I'll refactor I promise, I just need this done ASAP" ;Todo: do what the docstring says
[t] ;TODO: oh Luna the hardcodes ;;RESUME
(let [link-url (str "https://boards.4chan.org/mlp/thread/" (:no t))] ; jesus, well I said only /mlp/ is supported now so fuck it
{:title (format "%.2f%% - %s" (:chod t) (:title t)) ;TODO: Generate link from the target somehow, or just include it from API response
;; :url link-url <- this is supposed to be for images according to: https://cyber.harvard.edu/rss/rss.html
:description (format "The thread: '%s' has %.2f%% chance of dying" (:title t) (:chod t))
:link link-url
:guid (:guid t)}))
(defn generate-feed
"Generates feed from matching items
READS FROM GLOBALS:
rss-thread-watch.config/VERSION
rss-thread-watch.config/GLOBAL_CONFIG"
[filters chod-treshold repeat? cache board-config self-link]
(let [items (filter-chod-posts filters chod-treshold repeat? cache)
head {:title (str "RSS Thread watcher v" conf/VERSION)
;; :link is the homepage of the channel
:link (get @conf/GLOBAL-CONFIG :homepage)
;; :feed-url is where you can get new items, must match the URL this is served at
:feed-url self-link
"Generates feed from matching items"
[query-vec chod-treshold repeat? cache]
(let [items (filter-chod-posts query-vec chod-treshold repeat? cache)
head {:title "RSS Thread watcher v0.4" ;TODO: hardcoded string here, remake to reference to config.clj
:link "https://tools.treebrary.org/thread-watcher/feed.xml"
:feed-url "https://tools.treebrary.org/thread-watcher/feed.xml"
:description "RSS based thread watcher"}
body (map #(thread-to-rss-item
%1
(get board-config :host)) items)]
body (map thread-to-rss-item items)]
(rss/channel-xml head body)))
(defn http-handler
"Handles HTTP requests, returns generated feed
READS FROM GLOBALS:
rss-thread-watch.watcher/chod-threads-cache
rss-thread-watch.config/GLOBAL-CONFIG"
rss-thread-watch.watcher.chod-threads-cache
rss-thread-watch.watcher.GLOBAL-CONFIG" ;TODO: Update if it really reads from there anymore
[rqst]
(try (let [served-filename (get @conf/GLOBAL-CONFIG :served-filename)
{{chod "chod"
board "board"
repeat? "repeat" :or {chod "94"
board (get @conf/GLOBAL-CONFIG :default-board)
repeat? false}
(try (let [{{chod "chod"
board "board" :or {chod "94"
board (get @watcher/GLOBAL-CONFIG :default-board)}
:as prms} :params
uri :uri
query :query-string
scheme :scheme
server-name :server-name} rqst
filters (make-filters prms f/known-filters)
;; BUG if local fileserver not running -> FileNotFound exception is thrown and it fucks up the feed generation
;; Should be handled because wrong config and thus url generation could do the same
self-uri (str (s/replace-first scheme ":" "") ;
"://" server-name uri "?" query)
board-config (get-in @conf/GLOBAL-CONFIG [:boards-enabled board])
real-chod (try (max (Integer/parseInt (or (and (vector? chod)
(first chod))
chod)) 60) ;HARDCODED CHoD
(catch Exception _
(get board-config :default-chod)))
uri :uri} rqst
qrs (prms "q")
queries (if (vector? qrs) qrs [qrs]) ; to always return vector
repeat? (prms "repeat")
real-chod (if-let [ch (or (and (vector? chod)
(first chod))
chod)]
(try ;If we can't parse number from chod, use default 94
(if (or (vector? chod)
(<= (Integer/parseInt chod) 60)) ; Never accept chod lower than 60 TODO: don't hardcode this
60 (Integer/parseInt chod))
(catch Exception e
94)))
cache @watcher/chod-threads-cache]
(println "\n\nRCVD: " rqst)
;; (println rqst)
;; ====== Errors =====
;; Something other than $served-filename requested
(when-not (s/ends-with? uri served-filename)
;; Something other than feed.xml requested
(when-not (s/ends-with? uri "feed.xml")
(throw (ex-info "404" {:status 404
:header {"Content-Type" "text/plain"}
:body (str "404 This server has nothing but " served-filename)})))
:body "404 This server has nothing but /feed.xml"})))
(when-not (contains? @boards-enabled-cache board)
(throw (ex-info "403" {:status 403
:header {"Content-Type" "text/plain"}
:body (get @conf/GLOBAL-CONFIG :board-disabled-message)})))
:body (get @watcher/GLOBAL-CONFIG :board-disabled-message)})))
;; No url params -> we redirect to documentation about params
(when (empty? prms)
(throw (ex-info "302"
(response/redirect (get @conf/GLOBAL-CONFIG :homepage)))))
(response/redirect "https://git.treebrary.org/Treebrary.org/rss-thread-watcher#headline-4"))))
;; No querry specified - don't know what to search for
(when-not (some f/known-filter-set (keys prms))
(when-not (prms "q")
(throw (ex-info "400" {:status 400
:header {"Content-Type" "text/plain"}
:body (str "400 You MUST specify query with one OR more'q=searchTerm' (or 'Q=SeARChteRm' for case sensitive) url parameter(s)\n\n\n"
"Exmple: '" served-filename "?q=pony&q=IWTCIRD' will show in your feed all threads with 'pony' or 'IWTCIRD'"
:body (str "400 You MUST specify query with one OR more'q=searchTerm' url parameter(s)\n\n\n"
"Exmple: '/feed.xml?q=pony&q=IWTCIRD' will show in your feed all threads with 'pony' or 'IWTCIRD'"
" in their title that are about to die.")})))
;; Whether cache has been generated yet
(when (empty? cache)
(throw (ex-info "503" {:status 503
:header {"Content-Type" "text/plain"}
@ -184,7 +154,7 @@
;; There shouldn't be any problems with this mime type but if there are
;; replace with "text/xml", or even better, get RSS reader that is not utter shit
:header {"Content-Type" "application/rss+xml"}
:body (generate-feed filters real-chod repeat? (watcher/get-thread-data board @conf/GLOBAL-CONFIG) board-config self-uri)})
:body (generate-feed queries real-chod repeat? (watcher/get-thread-data board @watcher/GLOBAL-CONFIG))})
(catch Exception e
;; Ex-info has been crafted to match HTTP response body so we can send it
(if-let [caught (ex-data e)]
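The ~chod~ handling in ~http-handler~ above boils down to: take the first value if the parameter was repeated, parse it as an integer, never go below the minimum of 60, and fall back to a default when parsing fails. A minimal sketch of that rule, assuming a hypothetical ~parse-chod~ helper that is not in the repository:

```clojure
(defn parse-chod
  "Parses the raw `chod` URL parameter (a string, a vector of strings when
  the parameter was repeated, or nil). Returns at least `min-chod`, or
  `default-chod` when no integer can be parsed.
  Hypothetical helper for illustration only."
  [raw default-chod min-chod]
  (let [value (if (vector? raw) (first raw) raw)]
    (try
      ;; (str nil) is "", which fails to parse and falls through to the default
      (max (Integer/parseInt (str value)) min-chod)
      (catch Exception _
        default-chod))))

(println (map #(parse-chod % 94 60) ["98" "10" "abc" ["97" "99"] nil]))
```

Keeping the clamp and the fallback in one small function makes the hardcoded 60 and 94 easy to replace with per-board config values.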

View file

@ -1,37 +0,0 @@
;; Copyright (C) 2024 Felisp
;;
;; This program is free software: you can redistribute it and/or modify
;; it under the terms of the GNU Affero General Public License as published by
;; the Free Software Foundation, version 3 of the License.
;;
;; This program is distributed in the hope that it will be useful,
;; but WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
;; GNU Affero General Public License for more details.
;;
;; You should have received a copy of the GNU Affero General Public License
;; along with this program. If not, see <https://www.gnu.org/licenses/>.
(ns rss-thread-watch.filters
"Functions filtering posts"
(:require [clojure.string :as cs])
(:gen-class))
(defn case-sensitive-filter
"Returns true if the thread title contains any of [queries]. It's case sensitive"
[{:keys [title]} queries]
(some (fn [querry]
(cs/includes? title querry))
queries))
(defn case-insensitive-filter
"Returns true if the thread title contains any of [queries], ignoring case"
[{:keys [title]} queries]
(case-sensitive-filter {:title (cs/lower-case title)} (map cs/lower-case queries)))
(def known-filters
{"Q" case-sensitive-filter
"q" case-insensitive-filter})
(def known-filter-set (set (keys known-filters)))
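To make the difference between ~q~ and ~Q~ concrete, here is a small sketch of the two matching rules. The function names are hypothetical and take a plain title string instead of a thread map, so this is a simplification of the filters above rather than a copy of them.

```clojure
(require '[clojure.string :as cs])

(defn matches-case-sensitive?
  "True if `title` contains any of `queries` exactly as written (like ~Q~)."
  [title queries]
  (boolean (some #(cs/includes? title %) queries)))

(defn matches-case-insensitive?
  "True if `title` contains any of `queries`, ignoring case (like ~q~)."
  [title queries]
  (matches-case-sensitive? (cs/lower-case title)
                           (map cs/lower-case queries)))

(println (matches-case-insensitive? "Cute Pony Thread" ["cute"]))
(println (matches-case-sensitive? "Cute Pony Thread" ["cute"]))
```

The case-insensitive variant lowercases both sides and then reuses the case-sensitive check, which is the same layering the removed ~filters.clj~ uses.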

View file

@ -47,11 +47,6 @@
~x
result#)))
(defmacro vectorize
"If arg is not a vector, put into vector, otherwise return it"
[v]
(if (vector? v) v [v]))
;; ===== Generic functions ====
(defn indices
@ -74,28 +69,15 @@
{k (map-apply-defaults conf-val default-val)}
{k (nil?-else conf-val default-val)})))))
;; This is a shitty version of reduce-kv
(defn fmap
"Applies function [f] to every key and value in map [m]
Function signature should be (f [key value]).
Key stays unchanged"
Function signature should be (f [key value])."
[f m]
(into
(empty m)
(for [[key val] m]
[key (f key val)])))
(defn fkmap
;; I am horrible with docstrings, I don't deny that
"Applies function [f] to every key and value in map [m]
Function signature should be (f [key value]).
Unlike fmap, you can change key too, so return both {key value} in map"
[f m]
(into
(empty m)
(for [[key val] m]
(f key val))))
(defn expand-home
"Expands ~ to home directory"
;;modified from sauce: https://stackoverflow.com/questions/29585928/how-to-substitute-path-to-home-for

View file

@ -18,6 +18,12 @@
[clojure.data.json :as js])
(:gen-class))
(def GLOBAL-CONFIG
"Global config with defaults for missing entries"
;; I know globals are ew in Clojure but I don't know any
;; better way of doing this
(atom nil))
(def chod-threads-cache
"Cached map of threads that have CHanceOfDeath > configured"
(atom {}))
@ -32,7 +38,7 @@
(defn process-page
"Processes every thread in page, leaving only relevant information
(:title or :com, :no :chod :last_modified)"
(title no chod)"
([threads-to-index threads-total starting-index] (process-page threads-to-index threads-total starting-index (transient [])))
([remaining-threads threads-total index ret]
(if (empty? remaining-threads)
@ -45,12 +51,11 @@
(:com thread) ;we use body if thread has it
"") ;Thread has neither, this prevents null pointer
:no (:no thread)
:chod (* 100 (float (/ index threads-total)))
:last-modified (:last_modified thread)}))))))
:chod (* 100 (float (/ index threads-total)))}))))))
(defn build-cache
"Build cache of near-death threads so the values don't have to be recalculated on each request."
[pages-to-index threads-per-page threads-total]
[pages-to-index pages-total threads-per-page threads-total]
{:time (System/currentTimeMillis)
:data (vec (flatten (map (fn [single-page]
;; We have to (dec page-number) bcs otherwise we would get the total number of threads
@ -64,9 +69,8 @@
[url] - Url to download data from
[board] - Board to assign cached data to, it's existence is NOT checked here
[starting-page] - From which page consider threads to be fit for near-death cache
Returns :data part of [board] cache
THIS FUNCTION WRITES TO:
rss-thread-watch.watcher/chod-threads-cache"
THIS FUNCTION WRITES TO chod-threads-cache
Returns :data part of [board] cache"
[url board starting-page]
;; Todo: surround with try so we can timeout, 40x and other stuff
(let [catalog (with-open [readr (io/reader url)]
@ -83,7 +87,7 @@
;; This will return nill and that fuck everything up
(println "Refreshed cache for " board)
(reset! (get @chod-threads-cache board)
(build-cache to-index threads-per-page threads-total))))
(build-cache to-index pages-total threads-per-page threads-total))))
(defn board-enabled?
"Checks whether board is enabled in config"
@ -103,13 +107,13 @@
MAY CAUSE WRITE TO chod-threads-cache IF NECESSARY"
[board config]
(let [refresh-rate (* 1000 (get-in config `(:boards-enabled ~board :refresh-rate)))
board-catalog-url (get-in config `(:boards-enabled ~board :target))
{time-downloaded :time
{data :data
time-downloaded :time
:or {time-downloaded 0}
:as board-atom } @(get @chod-threads-cache board)
;; TODO: This also makes it implicitly lazy-load -> if disabled make the check here
time-to-update? (or (nil? board-atom)
(> (System/currentTimeMillis) (+ refresh-rate time-downloaded)))]
(if time-to-update?
(update-board-cache! board-catalog-url board (get-in config [:boards-enabled board :starting-page]))
(update-board-cache! (get-board-url board config) board (get-in config [:boards-enabled board :starting-page]))
@(get @chod-threads-cache board))))
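The ~:chod~ value computed in ~process-page~ is simply the thread's position in the catalog expressed as a percentage. A minimal sketch of that arithmetic, using the same expression as the source but wrapped in a hypothetical ~chance-of-death~ function:

```clojure
(defn chance-of-death
  "Position of a thread in the catalog as a percentage.
  `index` is the thread's position and `threads-total` the catalog size.
  Hypothetical wrapper around the expression used in process-page."
  [index threads-total]
  (* 100 (float (/ index threads-total))))

;; The README example: position 147 of 150 is roughly a 98% chance of death
(println (format "%.2f" (chance-of-death 147 150)))
```

Because the ratio passes through a single-precision float, the result is only approximately 98, so exact equality checks against whole numbers are best avoided.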