diff --git a/README.org b/README.org
index df34812..e5537c4 100644
--- a/README.org
+++ b/README.org
@@ -1,7 +1,7 @@
#+OPTIONS: toc:nil
* RSS based thread watcher
-Get notifications from your feed reader when your favourite /mlp/ thread is about to die
+Get notifications from your feed reader when your favourite thread is about to die
** Usage
@@ -24,11 +24,14 @@ Right now there is no automated way to generate your feed url but making one by
**** URL parameters
+Please note that default values may vary depending on which host you use. These are the defaults that come with this software, but
+anyone running an instance of RSS thread watcher can change them.
+
| Param name | Values [default] | Can have multiple? | Mandatory? | Short description |
|------------+-------------------------+--------------------+-------------------------+--------------------------------------------------------------------------------------------------|
-| board | "mlp" | No | No (not implemented) | Which board to generate feed for, *ONLY* /mlp/ is supported |
-| q | nil | Yes | Yes (1 or more) | This string is used to filter threads according to their titles |
-| chod | 60-99 [94] | No | No | CHanceOfDeath - will include thread in the feed if it's chance to death i > chod |
+| board | "mlp" | No | No | Which board to generate feed for, only boards enabled by host will work |
+| q | nil | Yes | Yes (1 or more) | This string is used to filter threads according to their titles, *REGEX NOT supported* yet |
+| chod | 60-99 [94] | No | No | CHanceOfDeath - will include thread in the feed if its chance of death is > chod |
| repeat | true, paranoid, [false] | No | No (partly implemented) | Whether to make new notification on every server update even when thread doesnt have higher chod |
| recreate | ~bool~ | Not implemented | Not implemented | Whether to notify when creation of new thread matching querry is detected (uses 4chans RSS) |
@@ -50,62 +53,54 @@ Standart rules of URLs apply, if you know how to pass params in URL to any websi
- Are in the lowest 98% part of catalog (it's on position ~147/150 e.g. 3 threads before being bumped off)
- Note that ~//~ are not special characters ~q=/general/~ will work as expected and match thread with "/general/" in it's title
- Also note that regex is *NOT* supported for now, so something like ~q=rainbow*~ will only match threads with "rainbow" followed
- immedidatelly by ~*~
- in their title
+ immediately by ~*~ in their title
*** Generating URL interactively
-Coming soon
+Coming soon (not really)
-** Limitations
+** Bugs
-This is an experimental project. There are several limitations:
-- Only supported board is /mlp/ (You can choose your own when self hosting)
-- Only searched threads are those who are in the 50% closer to death part of the catalog
-
-*** Bugs
-
-See [[https://git.treebrary.org/Treebrary.org/rss-thread-watcher/issues][issues]]
+See [[https://git.treebrary.org/Treebrary.org/rss-thread-watcher/issues?q=&type=all&state=open&labels=1&milestone=0&assignee=0&poster=0][issues]]
** Feature set
-- Planned/finnished features [23%]
+- Planned/finished features [38%]
- [X] [DONE] Super basic features done (feed, query, repeat)
- [X] Have proper sorting - The most likely to die threads first
- [X] No params request should redirect to url generator or (for now) documentation
- - [ ] Config file instead of hardcoding config values
+ - [X] Config file instead of hardcoding config values
- [ ] Include time of latest data fetch
- [ ] Make threads have preview images taken from the actuall thread OP
- [ ] Show which query matched the thread you were notified of
- [ ] Option to include advanced HTML formating of text (different color text for ChoD etc)
- [ ] Support notification on watched thread re-creation after it died
- [ ] Support notification for thread death
- - [ ] Support multiple boards at once
+ - [X] Support multiple boards at once
- [ ] Support async responses
- [ ] Graal VM support for native configuration
** Self hosting
-This is not supported until release 1.0. You can do it if you figure it out (probably not that hard tbh) but there will be much
-more detailed instructions in the future.
+As of the first Beta release, self hosting is supported. Please refer to the [[file:res/ExampleConfig-documented.edn][documented example config]] for information on configuration
+options.
*** Prebuilt
-There will be instructions at some point I promise. Until then you can download binaries from the releases page and run them like
-you would any other java executable, default port is ~6969~.
-
-And you need Java for now if that isn't clear.
-
+Download the newest release from [[https://git.treebrary.org/Treebrary.org/rss-thread-watcher/releases][releases]] and run it like you would any other Java executable. The default port is ~6969~.
~$ java -jar whatEverNameTheReleaseHas.jar~~
*** From source
+Not officially supported. If you attempt this, please use the source from a release tarball or check out the ~release~ or ~stable~
+branch. The ~dev~ branch is unstable and untested and may not even build. The ~stable~ branch should always build but may contain a newer version
+than the latest release.
-If you know Clojure, then just clone and build with lein. If you don't either RTFM to lein or wait before instructions will be
+If you know Clojure, then just clone and build with lein. If you don't, either RTFM for lein or wait until instructions are
avaiabile here.
*** Configuring
-Self hosting is not supported at the moment so no configuration for you.
+All documentation is for now included in the [[file:res/ExampleConfig-documented.edn][documented example config]].
*** Contributing
diff --git a/project.clj b/project.clj
index 73ebdc5..89fb513 100644
--- a/project.clj
+++ b/project.clj
@@ -1,4 +1,4 @@
-(defproject rss-thread-watch "0.1.0-SNAPSHOT"
+(defproject rss-thread-watch "0.4.0-SNAPSHOT"
:description "RSS based thread watcher"
:url "http://example.com/FIXME"
:license {:name "AGPL-3.0-only"
@@ -7,7 +7,8 @@
[ring/ring-core "1.8.2"]
[ring/ring-jetty-adapter "1.8.2"]
[clj-rss "0.4.0"]
- [org.clojure/data.json "2.4.0"]]
+ [org.clojure/data.json "2.4.0"]
+ [org.clojure/tools.cli "1.1.230"]]
:main ^:skip-aot rss-thread-watch.core
:target-path "target/%s"
:profiles {:uberjar {:aot :all}})
diff --git a/res/ExampleConfig-documented.edn b/res/ExampleConfig-documented.edn
new file mode 100644
index 0000000..87d2a8f
--- /dev/null
+++ b/res/ExampleConfig-documented.edn
@@ -0,0 +1,47 @@
+{:port 6969 ;Port to listen on
+ :default-board "/mlp/" ;Board to be used when no board=x param given
+ ;; Message displayed when requested board is not enabled
+ :board-disabled-message "This board is not enabled for feed generation.\n\nYou can contact me here: [contact]"
+ ;; :enable-board-listing true ;Whether to show list of enabled boards in /boards UNIMPLEMENTED
+
+ ;; This map defines default values for all enabled boards, if you wish for some board
+ ;; to use different values, specify them below in :boards-enabled
+ :boards-defaults {
+ ;; Number of seconds after which a fresh catalog.json is fetched from :target
+ :refresh-rate 300
+ ;; Page from which to start indexing threads, threads on pages with lower
+ ;; numbers will not be detectable by the feed watcher
+ :starting-page 7
+ ;; Default ChOD to use if none is specified by the user
+ :default-chod 94
+ ;; If you want to do some preprocessing beforehand, you can override
+ ;; Target URL for the board, but the response must be the same as what the 4chan API would return
+ ;; /$board/catalog.json will be appended to this link
+ :target "https://api.4chan.org"
+ ;; Commented parts below are still unimplemented
+ ;; ------
+ ;; Only download catalog when someone requests feed and cache is old
+ ;; Saves requests to 4chan, useful for boards that are checked rarely
+ ;; Generally the better option, but the first request after :refresh-rate has elapsed may take longer
+ ;; Currently the only option
+ :lazy-load true
+ ;; Whether to allow regex search thru the threads (&qr= param) UNIMPLEMENTED
+ ;; :regex-enable true
+ ;; Whether to create cache by downloading whole catalog or every required
+ ;; page one by one UNIMPLEMENTED
+ ;; :request-type [:catalog] :pages
+ }
+ ;; List of all boards that are enabled for feed generation
+ ;; Yes they must be all listed manually for now
+ ;; Each such board must have map of altered config options if applicable
+ ;; otherwise empty one must be provided
+ :boards-enabled {"/mlp/" {} ;; Empty override map means that defaults are used
+ ;; This means that board "/g/" will have :starting-page set to 7 but all
+ ;; the other config options are copied from :boards-defaults
+ "/g/" {:starting-page 7}
+ "/po/" {:starting-page 8
+ :refresh-rate 86400} ;1 day
+ "/p/" {:starting-page 8
+ :refresh-rate 1800} ;30 min
+ }
+}
diff --git a/src/rss_thread_watch/core.clj b/src/rss_thread_watch/core.clj
index a19555f..7799a10 100644
--- a/src/rss_thread_watch/core.clj
+++ b/src/rss_thread_watch/core.clj
@@ -1,4 +1,4 @@
-;; Copyright (C) 2023 Felisp
+;; Copyright (C) 2024 Felisp
;;
;; This program is free software: you can redistribute it and/or modify
;; it under the terms of the GNU Affero General Public License as published by
@@ -13,50 +13,131 @@
;; along with this program. If not, see .
(ns rss-thread-watch.core
- (:require [ring.adapter.jetty :as jetty]
+ (:require [clojure.java.io :as io]
+ [clojure.edn :as edn]
+ [clojure.pprint] ; -main calls clojure.pprint/pprint, so the namespace must be loaded
+ [clojure.tools.cli :refer [parse-opts]]
+ [ring.adapter.jetty :as jetty]
[ring.middleware.params :as rp]
[rss-thread-watch.watcher :as watcher]
- [rss-thread-watch.feed-generator :as feed])
+ [rss-thread-watch.feed-generator :as feed]
+ [rss-thread-watch.utils :as u])
(:gen-class))
-;; Internal default config
-(def CONFIG
- "Internal default config"
- {:target "https://api.4chan.org/mlp/catalog.json" ;Where to download catalog from
- :starting-page 7 ;only monitor threads from this from this page and up
- :refresh-delay (* 60 5) ;Redownload catalog every 5 mins
- :port 6969 ;Listen on 6969
- })
+(def VERSION "0.4.0")
+;; Internal default config
+(def CONFIG-DEFAULT
+ "Internal default config"
+ {:port 6969
+ :default-board "/mlp/"
+ :enable-board-listing true
+ :board-disabled-message "This board is not enabled for feed generation.\n\nYou can contact me here: [contact] and I may enable it for you"
+ :boards-defaults {:refresh-rate 300
+ :starting-page 7
+ :default-chod 94
+ :target "https://api.4chan.org"
+ :lazy-load true}
+ :boards-enabled {"/mlp/" {:lazy-load false}
+ "/g/" {:starting-page 7}
+ "/po/" {:starting-page 8
+ :refresh-rate 86400}
+ "/p/" {:starting-page 8
+ :refresh-rate 1800}}})
+
+(def cli-options
+ "Configuration defining program arguments for tools.cli"
+ [["-v" "--version" "Print version and license information"]
+ ["-h" "--help" "Prints help"]
+ ["-c" "--config CONFIG_FILE" "Specify config file to use for this run"
+ :default "./config.edn"
+ :validate [#(u/file-exists? %) "Specified config file does not exist or is not readable"]]
+ [nil "--print-default-config" "Prints internal default config file to STDOUT and exits"]])
+
+;; Todo: Think of a way to start repeated download for every catalog efficiently
(defn set-interval
"Calls function every ms"
+ {:deprecated true}
[callback ms]
(future (while true (do (try
(callback)
(println "Recached")
(catch Exception e
(binding [*out* *err*]
- (println "Error while updating cache: " e ", retrying in 5 minutes"))))
+ (println "Error while updating cache: " e ", retrying in " (/ ms 1000 60) " minutes"))))
(Thread/sleep ms)))))
+(defn load-config
+ "Attempts to load config from file [f].
+ Returns loaded config map or nil if failed"
+ [f]
+ (let [fl (io/as-file f)]
+ (when (.exists fl)
+ (with-open [r (io/reader fl)]
+ (edn/read (java.io.PushbackReader. r))))))
+
+(defn config-fill-board-defaults
+ "Fills every enabled board with default config values"
+ [config]
+ (let [defaults (:boards-defaults config)]
+ (dissoc (update-in config
+ '(:boards-enabled)
+ (fn [mp]
+ (u/fmap (fn [k v]
+ (u/map-apply-defaults v defaults))
+ mp)))
+ :boards-defaults)))
+
+(defn get-some-config
+ "Attempts to get config somehow,
+ first from [custom-file], if it's nil,
+ then from ./config.edn file.
+ If neither exists, default internal one is used."
+ [custom-file]
+ (config-fill-board-defaults
+ ;; TODO: There has to be try/catch for when file is invalid edn
+ ;; This is gonna be done when config validation comes in Beta 2
+ (let [file-to-try (u/nil?-else custom-file
+ "./config.edn")]
+ (u/when-else (load-config file-to-try)
+ CONFIG-DEFAULT))))
+
(defn -main
"Entry point, starts webserver"
[& args]
- (println "Starting on port: " (:port CONFIG)
- "\nGonna recache every: " (:refresh-delay CONFIG) "s")
- (set-interval (fn []
- (println "Starting cache update")
- (watcher/update-thread-cache! (:target CONFIG) (:starting-page CONFIG)))
- (* 1000 (:refresh-delay CONFIG)))
- (jetty/run-jetty (rp/wrap-params feed/http-handler) {:port (:port CONFIG)
- :join? true}))
+ (let [parsed-args (parse-opts args cli-options)
+ options (get parsed-args :options)]
+ (when-let [err (get parsed-args :errors)]
+ (println "Error: " err)
+ (System/exit 1))
+ (when (get options :version)
+ (println "RSS Thread Watcher " VERSION " Licensed under AGPL-3.0-only")
+ (System/exit 0))
+ (when (get options :help)
+ (println "RSS Thread Watcher help:\n" (get parsed-args :summary))
+ (System/exit 0))
+ (when (get options :print-default-config)
+ (println ";;Default internal config file from RSS Thread Watcher " VERSION)
+ (clojure.pprint/pprint CONFIG-DEFAULT)
+ ;; In case someone was copying by hand, this might be useful
+ (println ";;END of Default internal config file")
+ (System/exit 0))
+
+ (let [config (get-some-config (:config options))]
+ ;; TODO: probably refactor to use separate config.clj file when validation will be added
+ ;; Init the few globals we have
+ (reset! watcher/GLOBAL-CONFIG config)
+ (reset! feed/boards-enabled-cache (set (keys (get config :boards-enabled))))
+ (reset! watcher/chod-threads-cache (watcher/generate-chod-cache-structure config))
+ (clojure.pprint/pprint config)
+ (jetty/run-jetty (rp/wrap-params feed/http-handler) {:port (:port config) ; use the loaded config, not the internal default
+ :join? true}))))
;; Docs: https://github.com/ring-clojure/ring/wiki/Getting-Started
(defn repl-main
"Development entry point"
[]
(jetty/run-jetty (rp/wrap-params #'feed/http-handler)
- {:port (:port CONFIG)
+ {:port (:port CONFIG-DEFAULT)
;; Dont block REPL thread
:join? false}))
;; (repl-main)
diff --git a/src/rss_thread_watch/feed_generator.clj b/src/rss_thread_watch/feed_generator.clj
index 2ec8388..7c6c15d 100644
--- a/src/rss_thread_watch/feed_generator.clj
+++ b/src/rss_thread_watch/feed_generator.clj
@@ -1,4 +1,4 @@
-;; Copyright (C) 2023 Felisp
+;; Copyright (C) 2024 Felisp
;;
;; This program is free software: you can redistribute it and/or modify
;; it under the terms of the GNU Affero General Public License as published by
@@ -18,15 +18,12 @@
[ring.util.response :as response]
[clj-rss.core :as rss]
[clojure.string :as s]
- [rss-thread-watch.watcher :as watcher])
+ [rss-thread-watch.watcher :as watcher]
+ [rss-thread-watch.utils :as ut])
(:gen-class))
-
-(defn indices
- ;; https://stackoverflow.com/questions/8641305/find-index-of-an-element-matching-a-predicate-in-clojure
- "Returns indexes of elements passing predicate"
- [pred coll]
- (keep-indexed #(when (pred %2) %1) coll))
+(def boards-enabled-cache
+ (atom nil))
(defn new-guid-always
"Generates always unique GUID for Feed item.
@@ -51,12 +48,14 @@
(defn filter-chod-posts
"Return list of all threads with equal or higher ChoD than requested
- READS FROM GLOBALS: watcher.time-of-cache" ;Todo: best thing would be to add timestamp to cache
- [query-vec chod-treshold repeat? cache]
- (let [time-of-generation @watcher/time-of-cache
+ [board-cache] is a map of {:time ms-timestamp :data thread-vector}, as built by watcher/build-cache"
+ [query-vec chod-treshold repeat? board-cache]
+
+ (let [{time-of-generation :time
+ cache :data} board-cache
guid-fn (if repeat? (fn [x] (new-guid-always x time-of-generation))
update-only-guid)
- cache-start-index (first (indices (fn [x] (>= (:chod x) chod-treshold))
+ cache-start-index (first (ut/indices (fn [x] (>= (:chod x) chod-treshold))
cache))
;; So we don't have to search thru everything we have cached
needed-cache-part (subvec cache cache-start-index)
@@ -66,7 +65,7 @@
;; Would be so much easier for user to figure out why is it showing
;; and it would solve the problem of super long titles (or OPs instead of titles)
(when (some (fn [querry]
- (s/includes? title querry))
+ (s/includes? (s/lower-case title) (s/lower-case querry)))
query-vec)
t)))
(reverse needed-cache-part))]
@@ -76,9 +75,9 @@
(defn thread-to-rss-item
"If I wasnt retarded I could have made the cached version look like
rss item already but what can you do. I'll refactor I promise, I just need this done ASAP" ;Todo: do what the docstring says
- [t]
+ [t] ;TODO: oh Luna the hardcodes ;;RESUME
(let [link-url (str "https://boards.4chan.org/mlp/thread/" (:no t))] ; jesus, well I said only /mlp/ is supported now so fuck it
- {:title (format "%.2f%% - %s" (:chod t) (:title t))
+ {:title (format "%.2f%% - %s" (:chod t) (:title t)) ;TODO: Generate link from the target somehow, or just include it from API response
;; :url link-url <- this is supposed to be for images according to: https://cyber.harvard.edu/rss/rss.html
:description (format "The thread: '%s' has %.2f%% chance of dying" (:title t) (:chod t))
:link link-url
@@ -88,7 +87,7 @@
"Generates feed from matching items"
[query-vec chod-treshold repeat? cache]
(let [items (filter-chod-posts query-vec chod-treshold repeat? cache)
- head {:title "RSS Thread watcher v0.1"
+ head {:title "RSS Thread watcher v0.4" ;TODO: hardcoded string here, remake to reference to config.clj
:link "https://tools.treebrary.org/thread-watcher/feed.xml"
:feed-url "https://tools.treebrary.org/thread-watcher/feed.xml"
:description "RSS based thread watcher"}
@@ -100,9 +99,11 @@
READS FROM GLOBALS:
rss-thread-watch.watcher.chod-threads-cache
- rss-thread-watch.core.CONFIG"
+ rss-thread-watch.watcher.GLOBAL-CONFIG" ;TODO: Update if it really reads from there anymore
[rqst]
- (try (let [{{chod "chod" :or {chod "94"}
+ (try (let [{{chod "chod"
+ board "board" :or {chod "94"
+ board (get @watcher/GLOBAL-CONFIG :default-board)}
:as prms} :params
uri :uri} rqst
qrs (prms "q")
@@ -113,19 +114,23 @@
chod)]
(try ;If we can't parse number from chod, use default 94
(if (or (vector? chod)
- (<= (Integer/parseInt chod) 60)) ; Never accept chod lower that 60 TODO: don't hardcode this
+ (<= (Integer/parseInt chod) 60)) ; Never accept chod lower than 60 TODO: don't hardcode this
60 (Integer/parseInt chod))
(catch Exception e
94)))
cache @watcher/chod-threads-cache]
- ;; (println "RCVD: " rqst)
- (println rqst)
+ (println "\n\nRCVD: " rqst)
+ ;; (println rqst)
;; ====== Errors =====
;; Something other than feed.xml requested
(when-not (s/ends-with? uri "feed.xml")
(throw (ex-info "404" {:status 404
:header {"Content-Type" "text/plain"}
:body "404 This server has nothing but /feed.xml"})))
+ (when-not (contains? @boards-enabled-cache board)
+ (throw (ex-info "403" {:status 403
+ :header {"Content-Type" "text/plain"}
+ :body (get @watcher/GLOBAL-CONFIG :board-disabled-message)})))
;; No url params -> we redirect to documentation about params
(when (empty? prms)
(throw (ex-info "302"
@@ -149,13 +154,15 @@
;; There shouldn't be any problems with this mime type but if there are
;; replace with "text/xml", or even better, get RSS reader that is not utter shit
:header {"Content-Type" "application/rss+xml"}
- :body (generate-feed queries real-chod repeat? cache)})
+ :body (generate-feed queries real-chod repeat? (watcher/get-thread-data board @watcher/GLOBAL-CONFIG))})
(catch Exception e
;; Ex-info has been crafted to match HTTP response body so we can send it
(if-let [caught (ex-data e)]
caught ;We have custom crafted error
- {:status 500 ;Something else fucked up, we print what happened
- :header {"Content-Type" "text/plain"}
- :body (str "500 - Something fucked up while generating feed, If you decide to report it, please include url adress you used:\n"
- (ex-cause e) "\n"
- e)}))))
+ (do
+ (print "WTF??: " e)
+ {:status 500 ;Something else fucked up, we print what happened
+ :header {"Content-Type" "text/plain"}
+ :body (str "500 - Something fucked up while generating feed, If you decide to report it, please include the URL address you used:\n"
+ (ex-cause e) "\n"
+ e)})))))
diff --git a/src/rss_thread_watch/utils.clj b/src/rss_thread_watch/utils.clj
new file mode 100644
index 0000000..db53c12
--- /dev/null
+++ b/src/rss_thread_watch/utils.clj
@@ -0,0 +1,101 @@
+;; Copyright (C) 2024 Felisp
+;;
+;; This program is free software: you can redistribute it and/or modify
+;; it under the terms of the GNU Affero General Public License as published by
+;; the Free Software Foundation, version 3 of the License.
+;;
+;; This program is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+;; GNU Affero General Public License for more details.
+;;
+;; You should have received a copy of the GNU Affero General Public License
+;; along with this program. If not, see .
+
+(ns rss-thread-watch.utils
+ "Util functions"
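+ (:require [clojure.java.io] ; used fully qualified in file-exists?
+ [clojure.string]) ; used fully qualified in expand-home and expand-path
+ (:gen-class))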
+
+;; ===== Macros =====
+(defmacro nil?-else
+ "Return x unless it's nil, then return y"
+ [x y]
+ `(let [result# ~x]
+ (if (nil? result#)
+ ~y
+ result#)))
+
+(defmacro when-else
+ "Evaluates [tst], if it's truthy value returns that value.
+ If it's not, execute everything in [else] and return last expr."
+ [tst & else]
+ `(let [res# ~tst]
+ (if res#
+ res#
+ (do ~@else))))
+
+(defmacro ret=
+ "compares two values using [=]. If the result is true
+ returns the value, else the result of [=].
+
+ Useful with if-else"
+ [x y]
+ `(let [x# ~x
+ y# ~y
+ result# (= x# y#)] ; compare the evaluated values at runtime, not the forms at expansion time
+ (if result#
+ x#
+ result#)))
+
+;; ===== Generic functions ====
+
+(defn indices
+ ;; https://stackoverflow.com/questions/8641305/find-index-of-an-element-matching-a-predicate-in-clojure
+ "Returns indexes of elements passing predicate"
+ [pred coll]
+ (keep-indexed #(when (pred %2) %1) coll))
+
+(defn map-apply-defaults
+ "Apply default values from [defaults] to keys not present in [conf]
+ Order is very important.
+ Thus all missing values from config are replaced by defaults"
+ [conf defaults]
+ (into conf
+ (for [k (keys defaults)]
+ (let [conf-val (get conf k)
+ default-val (get defaults k)]
+ (if (and (map? conf-val) ; both are maps, we have to go level deeper
+ (map? default-val)) ; If only one is, we don't care cus then it's just assigment
+ {k (map-apply-defaults conf-val default-val)}
+ {k (nil?-else conf-val default-val)})))))
+
+(defn fmap
+ "Applies function [f] to every key and value in map [m]
+ Function signature should be (f [key value])."
+ [f m]
+ (into
+ (empty m)
+ (for [[key val] m]
+ [key (f key val)])))
+
+(defn expand-home
+ "Expands ~ to home directory"
+ ;;modified from sauce: https://stackoverflow.com/questions/29585928/how-to-substitute-path-to-home-for
+ [s]
+ (if (clojure.string/starts-with? s "~")
+ (clojure.string/replace-first s "~" (System/getProperty "user.home"))
+ s))
+
+(defn expand-path
+ [s]
+ (if (clojure.string/starts-with? s "./")
+ (clojure.string/replace-first s "." (System/getProperty "user.dir"))
+ (expand-home s)))
+
+(defn file-exists?
+ "Returns true if file exists"
+ [file]
+ (let [path (if (vector? file)
+ (first file)
+ file)]
+ (.exists (clojure.java.io/file (expand-path path)))))
diff --git a/src/rss_thread_watch/watcher.clj b/src/rss_thread_watch/watcher.clj
index eaf2df8..93b068c 100644
--- a/src/rss_thread_watch/watcher.clj
+++ b/src/rss_thread_watch/watcher.clj
@@ -1,4 +1,4 @@
-;; Copyright (C) 2023 Felisp
+;; Copyright (C) 2024 Felisp
;;
;; This program is free software: you can redistribute it and/or modify
;; it under the terms of the GNU Affero General Public License as published by
@@ -18,11 +18,23 @@
[clojure.data.json :as js])
(:gen-class))
-(def chod-threads-cache
- "Cached vector of threads that have CHanceOfDeath > configured"
- (atom []))
+(def GLOBAL-CONFIG
+ "Global config with defaults for missing entries"
+ ;; I know globals are ew in Clojure but I don't know any
+ ;; better way of doing this
+ (atom nil))
-(def time-of-cache (atom 0))
+(def chod-threads-cache
+ "Cached map of threads that have CHanceOfDeath > configured"
+ (atom {}))
+
+(defn generate-chod-cache-structure
+ "Generates initial structure for global cache
+ Structure is returned, you have to set it yourself"
+ [config]
+ (let [ks (keys (:boards-enabled config))]
+ (zipmap ks
+ (repeatedly (count ks) #(atom nil)))))
(defn process-page
"Procesess every thread in page, leaving only relevant information
@@ -44,27 +56,64 @@
(defn build-cache
"Build cache of near-death threads so the values don't have to be recalculated on each request."
[pages-to-index pages-total threads-per-page threads-total]
- (vec (flatten (map (fn [single-page]
- ;; We have to (dec page-number) bcs otherwise we would get the total number of threads
- ;; including the whole page of threads
- (let [page-number (dec (:page single-page))] ; inc to get to the actuall page
- (process-page (:threads single-page) threads-total (inc (* page-number threads-per-page)))))
- pages-to-index))))
+ {:time (System/currentTimeMillis)
+ :data (vec (flatten (map (fn [single-page]
+ ;; We have to (dec page-number) bcs otherwise we would get the total number of threads
+ ;; including the whole page of threads
+ (let [page-number (dec (:page single-page))] ; inc to get to the actuall page
+ (process-page (:threads single-page) threads-total (inc (* page-number threads-per-page)))))
+ pages-to-index)))})
-(defn update-thread-cache!
+(defn update-board-cache!
"Updates cache of near-death threads. Writes to chod-threads-cache as side effect.
[url] - Url to download data from
- [starting-page] - From which page consider threads to be fit for near-death cache"
- [url starting-page]
- ;; Todo: surround with try so we can timeout and other stuff
+ [board] - Board to assign cached data to, its existence is NOT checked here
+ [starting-page] - From which page consider threads to be fit for near-death cache
+ THIS FUNCTION WRITES TO chod-threads-cache
+ Returns the new cache map ({:time ... :data ...}) stored for [board]"
+ [url board starting-page]
+ ;; Todo: surround with try so we can timeout, 40x and other stuff
(let [catalog (with-open [readr (io/reader url)]
(js/read readr :key-fn keyword))
pages-total (count catalog)
;; universal calculation for total number of threads:
- ;; (pages-total-1) * threadsPerPage + threadsOnLastpage ;;accounts for boards which have stickied threads making them have 11pages
- threads-per-page (count (:threads (first catalog)))
+ ;; (pages-total -1) * threadsPerPage + threadsOnLastpage ;;accounts for boards which have stickied threads making them have 11pages
+ threads-per-page (count (:threads (first catalog))) ;; TODO: last could be remade to peek if it's a vector
threads-total (+ (* threads-per-page (dec pages-total)) (count (:threads (last catalog)))) ;; Todo: Yeah, maybe this calculation could be refactored into let
to-index (filter (fn [item]
(<= starting-page (:page item))) catalog)]
- (reset! chod-threads-cache (build-cache to-index pages-total threads-per-page threads-total))
- (reset! time-of-cache (System/currentTimeMillis))))
+ ;; TODO: there absolutely must be try catch for missing - not enabled boards,
+ ;; This is probably resolved now, but keeping it just in case
+ ;; This will return nil and that will fuck everything up
+ (println "Refreshed cache for " board)
+ (reset! (get @chod-threads-cache board)
+ (build-cache to-index pages-total threads-per-page threads-total))))
+
+(defn board-enabled?
+ "Checks whether board is enabled in config"
+ [board config]
+ (contains? (get config :boards-enabled) board)) ; contains? takes the map first, then the key
+
+(defn get-board-url
+ "Builds the catalog.json URL for [board] from the board's :target"
+ [board config]
+ ;; TODO: jesus, this needs sanitization and should be probably crafted by some URL class
+ (str (get-in config [:boards-enabled board :target]) board "catalog.json"))
+
+(defn get-thread-data
+ "Gets thread cache for given board.
+ If board is lazy loaded, downloads new one if needed.
+
+ MAY CAUSE WRITE TO chod-threads-cache IF NECESSARY"
+ [board config]
+ (let [refresh-rate (* 1000 (get-in config `(:boards-enabled ~board :refresh-rate)))
+ {data :data
+ time-downloaded :time
+ :or {time-downloaded 0}
+ :as board-atom } @(get @chod-threads-cache board)
+ ;; TODO: This also makes it implicitly lazy-load -> if disabled make the check here
+ time-to-update? (or (nil? board-atom)
+ (> (System/currentTimeMillis) (+ refresh-rate time-downloaded)))]
+ (if time-to-update?
+ (update-board-cache! (get-board-url board config) board (get-in config [:boards-enabled board :starting-page]))
+ @(get @chod-threads-cache board))))
diff --git a/test/rss_thread_watch/utils_test.clj b/test/rss_thread_watch/utils_test.clj
new file mode 100644
index 0000000..92525c3
--- /dev/null
+++ b/test/rss_thread_watch/utils_test.clj
@@ -0,0 +1,67 @@
+;; Copyright (C) 2024 Felisp
+;;
+;; This program is free software: you can redistribute it and/or modify
+;; it under the terms of the GNU Affero General Public License as published by
+;; the Free Software Foundation, version 3 of the License.
+;;
+;; This program is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+;; GNU Affero General Public License for more details.
+;;
+;; You should have received a copy of the GNU Affero General Public License
+;; along with this program. If not, see .
+
+(ns rss-thread-watch.utils-test
+ (:require [clojure.test :refer :all]
+ [rss-thread-watch.utils :refer :all]))
+
+(def first-map
+ "Example config map with two flat keys and one nested map"
+ {:a :b
+ :c "c"
+ :nested {:fst 1 :scnd {:super :nested}}})
+
+(def pony-map
+ "Map containing none of the items in map 1"
+ {:best-pony "Twilight Sparkle"})
+
+(def conflicting-basic-merge (conj pony-map {:a 17 :c 15}))
+
+(def deep-pony-map {:a "x"
+ :c :something-else
+ :nested {:ponies "everywhere"
+ :fst 69}})
+
+(def empty-map {})
+
+(deftest map-apply-defaults-test
+ (testing "Full and no-replace"
+ (is (= first-map (map-apply-defaults first-map empty-map))
+ "No defaults should return conf map unchanged")
+ (is (= first-map (map-apply-defaults empty-map first-map))
+ "Empty map should be completely replaced by defaults"))
+
+ (testing "Basic merge"
+ (is (= (conj pony-map first-map) (map-apply-defaults first-map pony-map))
+ "When all keys unique, maps should be conjd")
+ (is (= (conj first-map pony-map) (map-apply-defaults first-map pony-map))
+ "When all keys unique, maps should be conjd, order matters")
+ (is (= (conj first-map pony-map) (map-apply-defaults pony-map first-map))
+ "When all keys unique, maps should be conjd, more order that matters")
+ (is (= (conj first-map pony-map) (map-apply-defaults first-map conflicting-basic-merge))
+ "Conflicting basic merge"))
+ ;; Most important part, this is the reason we have the function in the first place
+ ;; Conj wont merge deep
+ (testing "Nested merge"
+ (is (= {:a :b
+ :c "c"
+ :nested {:ponies "everywhere"
+ :fst 1
+ :scnd {:super :nested}}}
+ (map-apply-defaults first-map deep-pony-map)))))
+
+(deftest fmap-test
+ (testing "Applying function to values of map"
+ (is (= {:a 2 :b 3} (fmap (fn [k v] (inc v))
+ {:a 1 :b 2})))))