Merge pull request 'Merge release Beta 1 into stable' (#21) from dev into stable
Reviewed-on: https://git.treebrary.org/Treebrary.org/rss-thread-watcher/pulls/21
commit
67f135a508
8 changed files with 444 additions and 96 deletions
49
README.org
|
@ -1,7 +1,7 @@
|
|||
#+OPTIONS: toc:nil
|
||||
* RSS based thread watcher
|
||||
|
||||
Get notifications from your feed reader when your favourite /mlp/ thread is about to die
|
||||
Get notifications from your feed reader when your favourite thread is about to die
|
||||
|
||||
** Usage
|
||||
|
||||
|
@ -24,11 +24,14 @@ Right now there is no automated way to generate your feed url but making one by
|
|||
|
||||
**** URL parameters
|
||||
|
||||
Please note that default values may vary depending on which host you use. These are the defaults that ship with this software, but
anyone running an instance of RSS thread watcher can change them.
|
||||
|
||||
| Param name | Values [default]        | Can have multiple? | Mandatory?              | Short description |
|------------+-------------------------+--------------------+-------------------------+-------------------|
| board      | "mlp"                   | No                 | No (not implemented)    | Which board to generate feed for, *ONLY* /mlp/ is supported |
| q          | nil                     | Yes                | Yes (1 or more)         | This string is used to filter threads according to their titles |
| chod       | 60-99 [94]              | No                 | No                      | CHanceOfDeath - will include thread in the feed if it's chance to death i > chod |
| board      | "mlp"                   | No                 | No                      | Which board to generate feed for, only boards enabled by host will work |
| q          | nil                     | Yes                | Yes (1 or more)         | This string is used to filter threads according to their titles, *REGEX NOT supported* yet |
| chod       | 60-99 [94]              | No                 | No                      | CHanceOfDeath - will include thread in the feed if its chance of death is > chod |
| repeat     | true, paranoid, [false] | No                 | No (partly implemented) | Whether to make a new notification on every server update even when the thread doesn't have a higher chod |
| recreate   | ~bool~                  | Not implemented    | Not implemented         | Whether to notify when creation of a new thread matching the query is detected (uses 4chan's RSS) |
|
||||
|
||||
|
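The parameter table above can be turned into a feed URL by hand. As a minimal sketch, assuming the instance is reachable at the `feed.xml` address that the feed header in this commit advertises, a small helper could assemble the query string. The `feed-url` name and its argument map are hypothetical and not part of the project.

```clojure
(require '[clojure.string :as str])
(import 'java.net.URLEncoder)

(defn feed-url
  "Hypothetical helper: builds a feed URL from a base address, an optional
  chod threshold and one or more q filters (q may repeat, per the table)."
  [base {:keys [chod queries]}]
  (let [pairs  (concat (when chod [["chod" chod]])
                       (map (fn [q] ["q" q]) queries))
        ;; Percent-encode each value so characters like / survive transport
        encode (fn [v] (URLEncoder/encode (str v) "UTF-8"))]
    (str base "?"
         (str/join "&" (for [[k v] pairs] (str k "=" (encode v)))))))

(feed-url "https://tools.treebrary.org/thread-watcher/feed.xml"
          {:chod 90 :queries ["/general/" "rainbow"]})
;; => "https://tools.treebrary.org/thread-watcher/feed.xml?chod=90&q=%2Fgeneral%2F&q=rainbow"
```

Repeating `q` once per filter mirrors the "Can have multiple?" column; the encoded `%2F` decodes back to `/` on the server side.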
@ -50,62 +53,54 @@ Standart rules of URLs apply, if you know how to pass params in URL to any websi
|
|||
- Are in the lowest 98% part of catalog (it's on position ~147/150 e.g. 3 threads before being bumped off)
|
||||
- Note that ~//~ are not special characters, so ~q=/general/~ will work as expected and match threads with "/general/" in their title
|
||||
- Also note that regex is *NOT* supported for now, so something like ~q=rainbow*~ will only match threads with "rainbow" followed
|
||||
immedidatelly by ~*~
|
||||
in their title
|
||||
immediately by ~*~ in their title
|
||||
|
||||
*** Generating URL interactively
|
||||
|
||||
Coming soon
|
||||
Coming soon (not really)
|
||||
|
||||
** Limitations
|
||||
** Bugs
|
||||
|
||||
This is an experimental project. There are several limitations:
|
||||
- Only supported board is /mlp/ (You can choose your own when self hosting)
|
||||
- Only searched threads are those who are in the 50% closer to death part of the catalog
|
||||
|
||||
*** Bugs
|
||||
|
||||
See [[https://git.treebrary.org/Treebrary.org/rss-thread-watcher/issues][issues]]
|
||||
See [[https://git.treebrary.org/Treebrary.org/rss-thread-watcher/issues?q=&type=all&state=open&labels=1&milestone=0&assignee=0&poster=0][issues]]
|
||||
|
||||
** Feature set
|
||||
|
||||
- Planned/finnished features [23%]
|
||||
- Planned/finished features [38%]
|
||||
- [X] [DONE] Super basic features done (feed, query, repeat)
|
||||
- [X] Have proper sorting - The most likely to die threads first
|
||||
- [X] No params request should redirect to url generator or (for now) documentation
|
||||
- [ ] Config file instead of hardcoding config values
|
||||
- [X] Config file instead of hardcoding config values
|
||||
- [ ] Include time of latest data fetch
|
||||
- [ ] Make threads have preview images taken from the actual thread OP
|
||||
- [ ] Show which query matched the thread you were notified of
|
||||
- [ ] Option to include advanced HTML formatting of text (different color text for ChoD etc)
|
||||
- [ ] Support notification on watched thread re-creation after it died
|
||||
- [ ] Support notification for thread death
|
||||
- [ ] Support multiple boards at once
|
||||
- [X] Support multiple boards at once
|
||||
- [ ] Support async responses
|
||||
- [ ] Graal VM support for native configuration
|
||||
|
||||
** Self hosting
|
||||
|
||||
This is not supported until release 1.0. You can do it if you figure it out (probably not that hard tbh) but there will be much
|
||||
more detailed instructions in the future.
|
||||
As of the first Beta release, self hosting is supported. Please refer to the [[file:res/ExampleConfig-documented.edn][documented example config]] for information on configuration
options.
|
||||
|
||||
*** Prebuilt
|
||||
|
||||
There will be instructions at some point I promise. Until then you can download binaries from the releases page and run them like
|
||||
you would any other java executable, default port is ~6969~.
|
||||
|
||||
And you need Java for now if that isn't clear.
|
||||
|
||||
Download the newest release from [[https://git.treebrary.org/Treebrary.org/rss-thread-watcher/releases][releases]] and run it like you would any other Java executable. The default port is ~6969~.
~$ java -jar whatEverNameTheReleaseHas.jar~
|
||||
|
||||
*** From source
|
||||
Not officially supported. If you attempt this, please use the source from a release tarball or check out the ~release~ or ~stable~
branch. The ~dev~ branch is unstable and untested and may not even build. The ~stable~ branch should always build but may contain a newer version
than the latest release.
|
||||
|
||||
If you know Clojure, then just clone and build with lein. If you don't either RTFM to lein or wait before instructions will be
|
||||
If you know Clojure, then just clone and build with lein. If you don't, either RTFM for lein or wait until instructions are
available here.
|
||||
|
||||
*** Configuring
|
||||
|
||||
Self hosting is not supported at the moment so no configuration for you.
|
||||
All documentation is for now included in the [[file:res/ExampleConfig-documented.edn][documented example config]].
|
||||
|
||||
*** Contributing
|
||||
|
||||
|
|
|
@ -1,4 +1,4 @@
|
|||
(defproject rss-thread-watch "0.1.0-SNAPSHOT"
|
||||
(defproject rss-thread-watch "0.4.0-SNAPSHOT"
|
||||
:description "RSS based thread watcher"
|
||||
:url "http://example.com/FIXME"
|
||||
:license {:name "AGPL-3.0-only"
|
||||
|
@ -7,7 +7,8 @@
|
|||
[ring/ring-core "1.8.2"]
|
||||
[ring/ring-jetty-adapter "1.8.2"]
|
||||
[clj-rss "0.4.0"]
|
||||
[org.clojure/data.json "2.4.0"]]
|
||||
[org.clojure/data.json "2.4.0"]
|
||||
[org.clojure/tools.cli "1.1.230"]]
|
||||
:main ^:skip-aot rss-thread-watch.core
|
||||
:target-path "target/%s"
|
||||
:profiles {:uberjar {:aot :all}})
|
||||
|
|
47
res/ExampleConfig-documented.edn
Normal file
@@ -0,0 +1,47 @@
{:port 6969                 ;Port to listen on
 :default-board "/mlp/"     ;Board to be used when no board=x param given
 ;; Message displayed when requested board is not enabled
 :board-disabled-message "This board is not enabled for feed generation.\n\nYou can contact me here: [contact]"
 ;; :enable-board-listing true ;Whether to show list of enabled boards in /boards UNIMPLEMENTED

 ;; This map defines default values for all enabled boards, if you wish for some board
 ;; to use different values, specify them below in :boards-enabled
 :boards-defaults {
                   ;; After how many seconds to get a fresh catalog.json from :target
                   :refresh-rate 300
                   ;; Page from which to start indexing threads, threads on pages with lower
                   ;; numbers will not be detectable by the feed watcher
                   :starting-page 7
                   ;; Default ChOD to use if none is specified by the user
                   :default-chod 94
                   ;; If you want to do some preprocessing beforehand, you can override
                   ;; Target URL for the board, but the response must be the same as the 4chan API would return
                   ;; /$board/catalog.json will be appended to this link
                   :target "https://api.4chan.org"
                   ;; Commented parts below are still unimplemented
                   ;; ------
                   ;; Only download catalog when someone requests feed and cache is old
                   ;; Saves requests to 4chan, useful for boards that are checked rarely
                   ;; Generally the better option, the first request after :refresh-rate has elapsed may take longer
                   ;; Currently the only option
                   :lazy-load true
                   ;; Whether to allow regex search thru the threads (&qr= param) UNIMPLEMENTED
                   ;; :regex-enable true
                   ;; Whether to create cache by downloading whole catalog or every required
                   ;; page one by one UNIMPLEMENTED
                   ;; :request-type [:catalog] :pages
                   }
 ;; List of all boards that are enabled for feed generation
 ;; Yes they must be all listed manually for now
 ;; Each such board must have map of altered config options if applicable
 ;; otherwise empty one must be provided
 :boards-enabled {"/mlp/" {} ;; Empty override map means that defaults are used
                  ;; This means that board "/g/" will have :starting-page set to 7 but all
                  ;; the other config options are copied from :boards-defaults
                  "/g/" {:starting-page 7}
                  "/po/" {:starting-page 8
                          :refresh-rate 86400} ;1 day
                  "/p/" {:starting-page 8
                         :refresh-rate 1800} ;30 min
                  }
 }
|
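The example config relies on every enabled board inheriting whatever it does not override from `:boards-defaults`. A minimal sketch of that idea follows, assuming a shallow `merge` is good enough for illustration; the project itself uses its own deep-merging `map-apply-defaults`, and the names `example-config` and `fill-board-defaults` here are hypothetical.

```clojure
(def example-config
  "Trimmed-down stand-in for the documented example config."
  {:boards-defaults {:refresh-rate 300 :starting-page 7 :default-chod 94}
   :boards-enabled  {"/mlp/" {}
                     "/po/"  {:starting-page 8 :refresh-rate 86400}}})

(defn fill-board-defaults
  "Gives every enabled board a complete option map: per-board overrides win,
  everything else comes from :boards-defaults, which is then dropped."
  [config]
  (let [defaults (:boards-defaults config)]
    (-> config
        (update :boards-enabled
                (fn [boards]
                  (into {} (for [[board overrides] boards]
                             ;; later maps win in merge, so overrides go last
                             [board (merge defaults overrides)]))))
        (dissoc :boards-defaults))))

(get-in (fill-board-defaults example-config) [:boards-enabled "/po/"])
;; => {:refresh-rate 86400, :starting-page 8, :default-chod 94}
```

Resolving the defaults once at startup means request-time code can read any option straight from the board's own map without a fallback lookup.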
@ -1,4 +1,4 @@
|
|||
;; Copyright (C) 2023 Felisp
|
||||
;; Copyright (C) 2024 Felisp
|
||||
;;
|
||||
;; This program is free software: you can redistribute it and/or modify
|
||||
;; it under the terms of the GNU Affero General Public License as published by
|
||||
|
@ -13,50 +13,131 @@
|
|||
;; along with this program. If not, see <https://www.gnu.org/licenses/>.
|
||||
|
||||
(ns rss-thread-watch.core
|
||||
(:require [ring.adapter.jetty :as jetty]
|
||||
(:require [clojure.java.io :as io]
|
||||
[clojure.edn :as edn]
|
||||
[clojure.tools.cli :refer [parse-opts]]
|
||||
[ring.adapter.jetty :as jetty]
|
||||
[ring.middleware.params :as rp]
|
||||
[rss-thread-watch.watcher :as watcher]
|
||||
[rss-thread-watch.feed-generator :as feed])
|
||||
[rss-thread-watch.feed-generator :as feed]
|
||||
[rss-thread-watch.utils :as u])
|
||||
(:gen-class))
|
||||
|
||||
;; Internal default config
|
||||
(def CONFIG
|
||||
"Internal default config"
|
||||
{:target "https://api.4chan.org/mlp/catalog.json" ;Where to download catalog from
|
||||
:starting-page 7 ;only monitor threads from this from this page and up
|
||||
:refresh-delay (* 60 5) ;Redownload catalog every 5 mins
|
||||
:port 6969 ;Listen on 6969
|
||||
})
|
||||
(def VERSION "0.4.0")
|
||||
|
||||
;; Internal default config
|
||||
(def CONFIG-DEFAULT
|
||||
"Internal default config"
|
||||
{:port 6969
|
||||
:default-board "/mlp/"
|
||||
:enable-board-listing true
|
||||
:board-disabled-message "This board is not enabled for feed generation.\n\nYou can contact me here: [contact] and I may enable it for you"
|
||||
:boards-defaults {:refresh-rate 300
|
||||
:starting-page 7
|
||||
:default-chod 94
|
||||
:target "https://api.4chan.org"
|
||||
:lazy-load true}
|
||||
:boards-enabled {"/mlp/" {:lazy-load false}
|
||||
"/g/" {:starting-page 7}
|
||||
"/po/" {:starting-page 8
|
||||
:refresh-rate 86400}
|
||||
"/p/" {:starting-page 8
|
||||
:refresh-rate 1800}}})
|
||||
|
||||
(def cli-options
"Configuration defining program arguments for tools.cli"
|
||||
[["-v" "--version" "Print version and license information"]
|
||||
["-h" "--help" "Prints help"]
|
||||
["-c" "--config CONFIG_FILE" "Specify config file to use for this run"
|
||||
:default "./config.edn"
|
||||
:validate [#(u/file-exists? %) "Specified config file does not exist or is not readable"]]
|
||||
[nil "--print-default-config" "Prints internal default config file to STDOUT and exits"]])
|
||||
|
||||
;; Todo: Think of a way to start repeated download for every catalog efficiently
|
||||
(defn set-interval
|
||||
"Calls function every ms"
|
||||
^{:deprecated true}
|
||||
[callback ms]
|
||||
(future (while true (do (try
|
||||
(callback)
|
||||
(println "Recached")
|
||||
(catch Exception e
|
||||
(binding [*out* *err*]
|
||||
(println "Error while updating cache: " e ", retrying in 5 minutes"))))
|
||||
(println "Error while updating cache: " e ", retrying in " (/ ms 1000 60) " minutes"))))
|
||||
(Thread/sleep ms)))))
|
||||
|
||||
(defn load-config
|
||||
"Attempts to load config from file [f].
|
||||
Returns loaded config map or nil if failed"
|
||||
[f]
|
||||
(let [fl (io/as-file f)]
|
||||
(when (.exists fl)
|
||||
(with-open [r (io/reader fl)]
|
||||
(edn/read (java.io.PushbackReader. r))))))
|
||||
|
||||
(defn config-fill-board-defaults
|
||||
"Fills every enabled board with default config values"
|
||||
[config]
|
||||
(let [defaults (:boards-defaults config)]
|
||||
(dissoc (update-in config
|
||||
'(:boards-enabled)
|
||||
(fn [mp]
|
||||
(u/fmap (fn [k v]
|
||||
(u/map-apply-defaults v defaults))
|
||||
mp)))
|
||||
:boards-defaults)))
|
||||
|
||||
(defn get-some-config
|
||||
"Attempts to get config somehow,
|
||||
first from [custom-file], if it's nil,
|
||||
then from ./config.edn file.
|
||||
If neither exists, the internal default config is used."
|
||||
[custom-file]
|
||||
(config-fill-board-defaults
|
||||
;; TODO: There has to be try/catch for when file is invalid edn
|
||||
;; This is gonna be done when config validation comes in Beta 2
|
||||
(let [file-to-try (u/nil?-else custom-file
|
||||
"./config.edn")]
|
||||
(u/when-else (load-config file-to-try)
|
||||
CONFIG-DEFAULT))))
|
||||
|
||||
(defn -main
|
||||
"Entry point, starts webserver"
|
||||
[& args]
|
||||
(println "Starting on port: " (:port CONFIG)
|
||||
"\nGonna recache every: " (:refresh-delay CONFIG) "s")
|
||||
(set-interval (fn []
|
||||
(println "Starting cache update")
|
||||
(watcher/update-thread-cache! (:target CONFIG) (:starting-page CONFIG)))
|
||||
(* 1000 (:refresh-delay CONFIG)))
|
||||
(jetty/run-jetty (rp/wrap-params feed/http-handler) {:port (:port CONFIG)
|
||||
:join? true}))
|
||||
(let [parsed-args (parse-opts args cli-options)
|
||||
options (get parsed-args :options)]
|
||||
(when-let [err (get parsed-args :errors)]
|
||||
(println "Error: " err)
|
||||
(System/exit 1))
|
||||
(when (get options :version)
|
||||
(println "RSS Thread Watcher " VERSION " Licensed under AGPL-3.0-only")
|
||||
(System/exit 0))
|
||||
(when (get options :help)
|
||||
(println "RSS Thread Watcher help:\n" (get parsed-args :summary))
|
||||
(System/exit 0))
|
||||
(when (get options :print-default-config)
|
||||
(println ";;Default internal config file from RSS Thread Watcher " VERSION)
|
||||
(clojure.pprint/pprint CONFIG-DEFAULT)
|
||||
;; In case someone was copying by hand, this might be useful
|
||||
(println ";;END of Default internal config file")
|
||||
(System/exit 0))
|
||||
|
||||
(let [config (get-some-config (:config options))]
|
||||
;; TODO: probably refactor to use separate config.clj file when validation will be added
|
||||
;; Init the few globals we have
|
||||
(reset! watcher/GLOBAL-CONFIG config)
|
||||
(reset! feed/boards-enabled-cache (set (keys (get config :boards-enabled))))
|
||||
(reset! watcher/chod-threads-cache (watcher/generate-chod-cache-structure config))
|
||||
(clojure.pprint/pprint config)
|
||||
(jetty/run-jetty (rp/wrap-params feed/http-handler) {:port (:port config) ;use the loaded config, not the internal default
|
||||
:join? true}))))
|
||||
|
||||
;; Docs: https://github.com/ring-clojure/ring/wiki/Getting-Started
|
||||
(defn repl-main
|
||||
"Development entry point"
|
||||
[]
|
||||
(jetty/run-jetty (rp/wrap-params #'feed/http-handler)
|
||||
{:port (:port CONFIG)
|
||||
{:port (:port CONFIG-DEFAULT)
|
||||
;; Dont block REPL thread
|
||||
:join? false}))
|
||||
;; (repl-main)
|
||||
|
|
|
@ -1,4 +1,4 @@
|
|||
;; Copyright (C) 2023 Felisp
|
||||
;; Copyright (C) 2024 Felisp
|
||||
;;
|
||||
;; This program is free software: you can redistribute it and/or modify
|
||||
;; it under the terms of the GNU Affero General Public License as published by
|
||||
|
@ -18,15 +18,12 @@
|
|||
[ring.util.response :as response]
|
||||
[clj-rss.core :as rss]
|
||||
[clojure.string :as s]
|
||||
[rss-thread-watch.watcher :as watcher])
|
||||
[rss-thread-watch.watcher :as watcher]
|
||||
[rss-thread-watch.utils :as ut])
|
||||
(:gen-class))
|
||||
|
||||
|
||||
(defn indices
|
||||
;; https://stackoverflow.com/questions/8641305/find-index-of-an-element-matching-a-predicate-in-clojure
|
||||
"Returns indexes of elements passing predicate"
|
||||
[pred coll]
|
||||
(keep-indexed #(when (pred %2) %1) coll))
|
||||
(def boards-enabled-cache
|
||||
(atom nil))
|
||||
|
||||
(defn new-guid-always
|
||||
"Generates always unique GUID for Feed item.
|
||||
|
@ -51,12 +48,14 @@
|
|||
(defn filter-chod-posts
|
||||
"Return list of all threads with equal or higher ChoD than requested
|
||||
|
||||
READS FROM GLOBALS: watcher.time-of-cache" ;Todo: best thing would be to add timestamp to cache
|
||||
[query-vec chod-treshold repeat? cache]
|
||||
(let [time-of-generation @watcher/time-of-cache
|
||||
READS FROM GLOBALS: watcher.time-of-cache"
|
||||
[query-vec chod-treshold repeat? board-cache]
|
||||
|
||||
(let [{time-of-generation :time
|
||||
cache :data} board-cache
|
||||
guid-fn (if repeat? (fn [x] (new-guid-always x time-of-generation))
|
||||
update-only-guid)
|
||||
cache-start-index (first (indices (fn [x] (>= (:chod x) chod-treshold))
|
||||
cache-start-index (first (ut/indices (fn [x] (>= (:chod x) chod-treshold))
|
||||
cache))
|
||||
;; So we don't have to search thru everything we have cached
|
||||
needed-cache-part (subvec cache cache-start-index)
|
||||
|
@ -66,7 +65,7 @@
|
|||
;; Would be so much easier for user to figure out why is it showing
|
||||
;; and it would solve the problem of super long titles (or OPs instead of titles)
|
||||
(when (some (fn [querry]
|
||||
(s/includes? title querry))
|
||||
(s/includes? (s/lower-case title) (s/lower-case querry)))
|
||||
query-vec)
|
||||
t)))
|
||||
(reverse needed-cache-part))]
|
||||
|
@ -76,9 +75,9 @@
|
|||
(defn thread-to-rss-item
|
||||
"If I wasnt retarded I could have made the cached version look like
|
||||
rss item already but what can you do. I'll refactor I promise, I just need this done ASAP" ;Todo: do what the docstring says
|
||||
[t]
|
||||
[t] ;TODO: oh Luna the hardcodes ;;RESUME
|
||||
(let [link-url (str "https://boards.4chan.org/mlp/thread/" (:no t))] ; jesus, well I said only /mlp/ is supported now so fuck it
|
||||
{:title (format "%.2f%% - %s" (:chod t) (:title t))
|
||||
{:title (format "%.2f%% - %s" (:chod t) (:title t)) ;TODO: Generate link from the target somehow, or just include it from API response
|
||||
;; :url link-url <- this is supposed to be for images according to: https://cyber.harvard.edu/rss/rss.html
|
||||
:description (format "The thread: '%s' has %.2f%% chance of dying" (:title t) (:chod t))
|
||||
:link link-url
|
||||
|
@ -88,7 +87,7 @@
|
|||
"Generates feed from matching items"
|
||||
[query-vec chod-treshold repeat? cache]
|
||||
(let [items (filter-chod-posts query-vec chod-treshold repeat? cache)
|
||||
head {:title "RSS Thread watcher v0.1"
|
||||
head {:title "RSS Thread watcher v0.4" ;TODO: hardcoded string here, remake to reference to config.clj
|
||||
:link "https://tools.treebrary.org/thread-watcher/feed.xml"
|
||||
:feed-url "https://tools.treebrary.org/thread-watcher/feed.xml"
|
||||
:description "RSS based thread watcher"}
|
||||
|
@ -100,9 +99,11 @@
|
|||
|
||||
READS FROM GLOBALS:
|
||||
rss-thread-watch.watcher.chod-threads-cache
|
||||
rss-thread-watch.core.CONFIG"
|
||||
rss-thread-watch.watcher.GLOBAL-CONFIG" ;TODO: Update if it really reads from there anymore
|
||||
[rqst]
|
||||
(try (let [{{chod "chod" :or {chod "94"}
|
||||
(try (let [{{chod "chod"
|
||||
board "board" :or {chod "94"
|
||||
board (get @watcher/GLOBAL-CONFIG :default-board)}
|
||||
:as prms} :params
|
||||
uri :uri} rqst
|
||||
qrs (prms "q")
|
||||
|
@ -113,19 +114,23 @@
|
|||
chod)]
|
||||
(try ;If we can't parse number from chod, use default 94
|
||||
(if (or (vector? chod)
|
||||
(<= (Integer/parseInt chod) 60)) ; Never accept chod lower that 60 TODO: don't hardcode this
|
||||
(<= (Integer/parseInt chod) 60)) ; Never accept chod lower than 60 TODO: don't hardcode this
|
||||
60 (Integer/parseInt chod))
|
||||
(catch Exception e
|
||||
94)))
|
||||
cache @watcher/chod-threads-cache]
|
||||
;; (println "RCVD: " rqst)
|
||||
(println rqst)
|
||||
(println "\n\nRCVD: " rqst)
|
||||
;; (println rqst)
|
||||
;; ====== Errors =====
|
||||
;; Something other than feed.xml requested
|
||||
(when-not (s/ends-with? uri "feed.xml")
|
||||
(throw (ex-info "404" {:status 404
|
||||
:header {"Content-Type" "text/plain"}
|
||||
:body "404 This server has nothing but /feed.xml"})))
|
||||
(when-not (contains? @boards-enabled-cache board)
|
||||
(throw (ex-info "403" {:status 403
|
||||
:header {"Content-Type" "text/plain"}
|
||||
:body (get @watcher/GLOBAL-CONFIG :board-disabled-message)})))
|
||||
;; No url params -> we redirect to documentation about params
|
||||
(when (empty? prms)
|
||||
(throw (ex-info "302"
|
||||
|
@ -149,13 +154,15 @@
|
|||
;; There shouldn't be any problems with this mime type but if there are
|
||||
;; replace with "text/xml", or even better, get RSS reader that is not utter shit
|
||||
:header {"Content-Type" "application/rss+xml"}
|
||||
:body (generate-feed queries real-chod repeat? cache)})
|
||||
:body (generate-feed queries real-chod repeat? (watcher/get-thread-data board @watcher/GLOBAL-CONFIG))})
|
||||
(catch Exception e
|
||||
;; Ex-info has been crafted to match HTTP response body so we can send it
|
||||
(if-let [caught (ex-data e)]
|
||||
caught ;We have custom crafted error
|
||||
(do
|
||||
(print "WTF??: " e)
|
||||
{:status 500 ;Something else fucked up, we print what happened
|
||||
:header {"Content-Type" "text/plain"}
|
||||
:body (str "500 - Something fucked up while generating feed, If you decide to report it, please include url adress you used:\n"
|
||||
(ex-cause e) "\n"
|
||||
e)}))))
|
||||
e)})))))
|
||||
|
|
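The handler's treatment of the `chod` parameter is easy to miss inside the nested `try`: a repeated parameter (which arrives as a vector) or a value of 60 or less becomes 60, and anything unparsable falls back to 94. A standalone sketch of the same rule, with the hypothetical name `parse-chod` and the two constants hardcoded as they are in this commit:

```clojure
(defn parse-chod
  "Returns the ChoD threshold to use for a raw request value.
  Vector (parameter given more than once) -> 60, number below 60 -> 60,
  unparsable input -> 94."
  [raw]
  (try
    (if (vector? raw)
      60
      ;; (str nil) is "", which fails to parse and lands in the catch
      (max 60 (Integer/parseInt (str raw))))
    (catch Exception _
      94)))

(parse-chod "97")        ;; => 97
(parse-chod "10")        ;; => 60
(parse-chod "not-a-num") ;; => 94
```

Pulling the rule into a pure function like this would also make it straightforward to cover with `clojure.test`.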
101
src/rss_thread_watch/utils.clj
Normal file
@@ -0,0 +1,101 @@
;; Copyright (C) 2024 Felisp
;;
;; This program is free software: you can redistribute it and/or modify
;; it under the terms of the GNU Affero General Public License as published by
;; the Free Software Foundation, version 3 of the License.
;;
;; This program is distributed in the hope that it will be useful,
;; but WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
;; GNU Affero General Public License for more details.
;;
;; You should have received a copy of the GNU Affero General Public License
;; along with this program. If not, see <https://www.gnu.org/licenses/>.

(ns rss-thread-watch.utils
  "Util functions"
  (:gen-class))

;; ===== Macros =====
(defmacro nil?-else
  "Return x unless it's nil, then return y"
  [x y]
  `(let [result# ~x]
     (if (nil? result#)
       ~y
       result#)))

(defmacro when-else
  "Evaluates [tst], if it's truthy value returns that value.
  If it's not, execute everything in [else] and return last expr."
  [tst & else]
  `(let [res# ~tst]
     (if res#
       res#
       (do ~@else))))

(defmacro ret=
  "compares two values using [=]. If the result is true
  returns the value, else the result of [=].

  Useful with if-else"
  [x y]
  `(let [x# ~x
         y# ~y
         result# (= x# y#)] ; compare at runtime, not at macro expansion
     (if result#
       x#
       result#)))

;; ===== Generic functions ====

(defn indices
  ;; https://stackoverflow.com/questions/8641305/find-index-of-an-element-matching-a-predicate-in-clojure
  "Returns indexes of elements passing predicate"
  [pred coll]
  (keep-indexed #(when (pred %2) %1) coll))

(defn map-apply-defaults
  "Apply default values from [defaults] to keys not present in [conf]
  Order is very important.
  Thus all missing values from config are replaced by defaults"
  [conf defaults]
  (into conf
        (for [k (keys defaults)]
          (let [conf-val (get conf k)
                default-val (get defaults k)]
            (if (and (map? conf-val)      ; both are maps, we have to go level deeper
                     (map? default-val))  ; If only one is, we don't care cus then it's just assignment
              {k (map-apply-defaults conf-val default-val)}
              {k (nil?-else conf-val default-val)})))))

(defn fmap
  "Applies function [f] to every key and value in map [m]
  Function signature should be (f [key value])."
  [f m]
  (into
   (empty m)
   (for [[key val] m]
     [key (f key val)])))

(defn expand-home
  "Expands ~ to home directory"
  ;;modified from sauce: https://stackoverflow.com/questions/29585928/how-to-substitute-path-to-home-for
  [s]
  (if (clojure.string/starts-with? s "~")
    (clojure.string/replace-first s "~" (System/getProperty "user.home"))
    s))

(defn expand-path
  [s]
  (if (clojure.string/starts-with? s "./")
    (clojure.string/replace-first s "." (System/getProperty "user.dir"))
    (expand-home s)))

(defn file-exists?
  "Returns true if file exists"
  [file]
  (let [path (if (vector? file)
               (first file)
               file)]
    (.exists (clojure.java.io/file (expand-path path)))))
|
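One reason a dedicated `nil?-else` helper can be preferable to plain `or` is that `or` also replaces an explicit `false`, which matters for boolean options such as `:lazy-load`. The sketch below illustrates the difference with a hypothetical function `nil-else`; unlike the macro in this file, a function evaluates its fallback argument eagerly.

```clojure
(defn nil-else
  "Function flavour of the idea: returns x unless it is nil, otherwise y.
  Note that y is always evaluated here because this is not a macro."
  [x y]
  (if (nil? x) y x))

;; `or` treats false like a missing value ...
(or false :default)        ;; => :default
;; ... while a nil check keeps an explicit false
(nil-else false :default)  ;; => false
(nil-else nil :default)    ;; => :default
```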
@ -1,4 +1,4 @@
|
|||
;; Copyright (C) 2023 Felisp
|
||||
;; Copyright (C) 2024 Felisp
|
||||
;;
|
||||
;; This program is free software: you can redistribute it and/or modify
|
||||
;; it under the terms of the GNU Affero General Public License as published by
|
||||
|
@ -18,11 +18,23 @@
|
|||
[clojure.data.json :as js])
|
||||
(:gen-class))
|
||||
|
||||
(def chod-threads-cache
|
||||
"Cached vector of threads that have CHanceOfDeath > configured"
|
||||
(atom []))
|
||||
(def GLOBAL-CONFIG
"Global config with defaults for missing entries"
|
||||
;; I know globals are ew in Clojure but I don't know any
|
||||
;; better way of doing this
|
||||
(atom nil))
|
||||
|
||||
(def time-of-cache (atom 0))
|
||||
(def chod-threads-cache
|
||||
"Cached map of threads that have CHanceOfDeath > configured"
|
||||
(atom {}))
|
||||
|
||||
(defn generate-chod-cache-structure
|
||||
"Generates initial structure for global cache
|
||||
Structure is returned, you have to set it yourself"
|
||||
[config]
|
||||
(let [ks (keys (:boards-enabled config))]
|
||||
(zipmap ks
|
||||
(repeatedly (count ks) #(atom nil)))))
|
||||
|
||||
(defn process-page
|
||||
"Procesess every thread in page, leaving only relevant information
|
||||
|
@ -44,27 +56,64 @@
|
|||
(defn build-cache
|
||||
"Build cache of near-death threads so the values don't have to be recalculated on each request."
|
||||
[pages-to-index pages-total threads-per-page threads-total]
|
||||
(vec (flatten (map (fn [single-page]
|
||||
{:time (System/currentTimeMillis)
|
||||
:data (vec (flatten (map (fn [single-page]
|
||||
;; We have to (dec page-number) bcs otherwise we would get the total number of threads
|
||||
;; including the whole page of threads
|
||||
(let [page-number (dec (:page single-page))] ; inc to get to the actuall page
|
||||
(process-page (:threads single-page) threads-total (inc (* page-number threads-per-page)))))
|
||||
pages-to-index))))
|
||||
pages-to-index)))})
|
||||
|
||||
(defn update-thread-cache!
|
||||
(defn update-board-cache!
|
||||
"Updates cache of near-death threads. Writes to chod-threads-cache as side effect.
|
||||
[url] - Url to download data from
|
||||
[starting-page] - From which page consider threads to be fit for near-death cache"
|
||||
[url starting-page]
|
||||
;; Todo: surround with try so we can timeout and other stuff
|
||||
[board] - Board to assign cached data to, its existence is NOT checked here
|
||||
[starting-page] - From which page consider threads to be fit for near-death cache
|
||||
THIS FUNCTION WRITES TO chod-threads-cache
|
||||
Returns :data part of [board] cache"
|
||||
[url board starting-page]
|
||||
;; Todo: surround with try so we can timeout, 40x and other stuff
|
||||
(let [catalog (with-open [readr (io/reader url)]
|
||||
(js/read readr :key-fn keyword))
|
||||
pages-total (count catalog)
|
||||
;; universal calculation for total number of threads:
|
||||
;; (pages-total -1) * threadsPerPage + threadsOnLastpage ;;accounts for boards which have stickied threads making them have 11pages
|
||||
threads-per-page (count (:threads (first catalog)))
|
||||
threads-per-page (count (:threads (first catalog))) ;; TODO: last could be remade to peek if it's a vector
|
||||
threads-total (+ (* threads-per-page (dec pages-total)) (count (:threads (last catalog)))) ;; Todo: Yeah, maybe this calculation could be refactored into let
|
||||
to-index (filter (fn [item]
|
||||
(<= starting-page (:page item))) catalog)]
|
||||
(reset! chod-threads-cache (build-cache to-index pages-total threads-per-page threads-total))
|
||||
(reset! time-of-cache (System/currentTimeMillis))))
|
||||
;; TODO: there absolutely must be try catch for missing - not enabled boards,
|
||||
;; This is probably resolved now, but keeping it just in case
|
||||
;; This will return nill and that fuck everything up
|
||||
(println "Refreshed cache for " board)
|
||||
(reset! (get @chod-threads-cache board)
|
||||
(build-cache to-index pages-total threads-per-page threads-total))))
|
||||
|
||||
(defn board-enabled?
|
||||
"Checks whether board is enabled in config"
|
||||
[board config]
|
||||
(contains? (get config :boards-enabled) board)) ; contains? takes the map first, then the key
|
||||
|
||||
(defn get-board-url
"Builds the catalog.json url for [board] from its :target"
|
||||
[board config]
|
||||
;; TODO: jesus, this needs sanitization and should be probably crafted by some URL class
|
||||
(str (get-in config [:boards-enabled board :target]) board "catalog.json"))
|
||||
|
||||
(defn get-thread-data
|
||||
"Gets thread cache for given board.
|
||||
If board is lazy loaded, downloads new one if needed.
|
||||
|
||||
MAY CAUSE WRITE TO chod-threads-cache IF NECESSARY"
|
||||
[board config]
|
||||
(let [refresh-rate (* 1000 (get-in config `(:boards-enabled ~board :refresh-rate)))
|
||||
{data :data
|
||||
time-downloaded :time
|
||||
:or {time-downloaded 0}
|
||||
:as board-atom } @(get @chod-threads-cache board)
|
||||
;; TODO: This also makes it implictly lazy-load -> if disabled make the check here
|
||||
time-to-update? (or (nil? board-atom)
|
||||
(> (System/currentTimeMillis) (+ refresh-rate time-downloaded)))]
|
||||
(if time-to-update?
|
||||
(update-board-cache! (get-board-url board config) board (get-in config [:boards-enabled board :starting-page]))
|
||||
@(get @chod-threads-cache board))))
|
||||
|
|
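Two small calculations carry most of the watcher's logic: the total thread count derived from the catalog pages, and the check that decides whether a board's cache is due for a refresh. As a rough sketch under the assumption that a cache entry is a map with a `:time` key in milliseconds (as `build-cache` produces in this commit), they could be isolated like this; the names `threads-total` and `stale?` are hypothetical.

```clojure
(defn threads-total
  "(pages - 1) * threads-per-page + threads on the last page,
  the formula given in the update-board-cache! comment."
  [threads-per-page pages-total threads-on-last-page]
  (+ (* threads-per-page (dec pages-total)) threads-on-last-page))

(defn stale?
  "True when there is no cache entry yet, or when more than
  refresh-rate-s seconds have passed since its :time stamp."
  [cache-entry refresh-rate-s now-ms]
  (or (nil? cache-entry)
      (> now-ms (+ (* 1000 refresh-rate-s) (:time cache-entry 0)))))

(threads-total 15 10 15)                 ;; => 150
(stale? nil 300 0)                       ;; => true
(stale? {:time 1000 :data []} 300 2000)  ;; => false
```

Passing the current time in as an argument, rather than calling `System/currentTimeMillis` inside, keeps the check deterministic and easy to test.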
67
test/rss_thread_watch/utils_test.clj
Normal file
@@ -0,0 +1,67 @@
;; Copyright (C) 2024 Felisp
;;
;; This program is free software: you can redistribute it and/or modify
;; it under the terms of the GNU Affero General Public License as published by
;; the Free Software Foundation, version 3 of the License.
;;
;; This program is distributed in the hope that it will be useful,
;; but WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
;; GNU Affero General Public License for more details.
;;
;; You should have received a copy of the GNU Affero General Public License
;; along with this program. If not, see <https://www.gnu.org/licenses/>.

(ns rss-thread-watch.utils-test
  (:require [clojure.test :refer :all]
            [rss-thread-watch.utils :refer :all]))

(def first-map
  "Example config map with two keys"
  {:a :b
   :c "c"
   :nested {:fst 1 :scnd {:super :nested}}})

(def pony-map
  "Map containing none of the items in map 1"
  {:best-pony "Twilight Sparkle"})

(def conflicting-basic-merge (conj pony-map {:a 17 :c 15}))

(def deep-pony-map {:a "x"
                    :c :something-else
                    :nested {:ponies "everywhere"
                             :fst 69}})

(def empty-map {})

(deftest map-apply-defaults-test
  (testing "Full and no-replace"
    (is (= first-map (map-apply-defaults first-map empty-map))
        "No defaults should return conf map unchanged")
    (is (= first-map (map-apply-defaults empty-map first-map))
        "Empty map should be completely replaced by defaults"))

  (testing "Basic merge"
    (is (= (conj pony-map first-map) (map-apply-defaults first-map pony-map))
        "When all keys unique, maps should be conjd")
    (is (= (conj first-map pony-map) (map-apply-defaults first-map pony-map))
        "When all keys unique, maps should be conjd, order matters")
    (is (= (conj first-map pony-map) (map-apply-defaults pony-map first-map))
        "When all keys unique, maps should be conjd, more order that matters")
    (is (= (conj first-map pony-map) (map-apply-defaults first-map conflicting-basic-merge))
        "Conflicting basic merge"))
  ;; Most important part, this is the reason we have the function in the first place
  ;; Conj wont merge deep
  (testing "Nested merge"
    (is (= {:a :b
            :c "c"
            :nested {:ponies "everywhere"
                     :fst 1
                     :scnd {:super :nested}}}
           (map-apply-defaults first-map deep-pony-map)))))

(deftest fmap-test
  (testing "Applying function to values of map"
    (is (= {:a 2 :b 3} (fmap (fn [k v] (inc v))
                             {:a 1 :b 2})))))