Merge release Beta 1 into stable #21
8 changed files with 444 additions and 96 deletions
49
README.org
|
@ -1,7 +1,7 @@
|
||||||
#+OPTIONS: toc:nil
|
#+OPTIONS: toc:nil
|
||||||
* RSS based thread watcher
|
* RSS based thread watcher
|
||||||
|
|
||||||
Get notifications from your feed reader when your favourite /mlp/ thread is about to die
|
Get notifications from your feed reader when your favourite thread is about to die
|
||||||
|
|
||||||
** Usage
|
** Usage
|
||||||
|
|
||||||
|
@ -24,11 +24,14 @@ Right now there is no automated way to generate your feed url but making one by
|
||||||
|
|
||||||
**** URL parameters
|
**** URL parameters
|
||||||
|
|
||||||
|
Please note that default values may vary depending on which host you use. These are the defaults that ship with this software, but
|
||||||
|
anyone running an instance of RSS thread watcher can change them.
|
||||||
|
|
||||||
| Param name | Values [default] | Can have multiple? | Mandatory? | Short description |
|
| Param name | Values [default] | Can have multiple? | Mandatory? | Short description |
|
||||||
|------------+-------------------------+--------------------+-------------------------+--------------------------------------------------------------------------------------------------|
|
|------------+-------------------------+--------------------+-------------------------+--------------------------------------------------------------------------------------------------|
|
||||||
| board | "mlp" | No | No (not implemented) | Which board to generate feed for, *ONLY* /mlp/ is supported |
|
| board | "mlp" | No | No | Which board to generate feed for, only boards enabled by host will work |
|
||||||
| q | nil | Yes | Yes (1 or more) | This string is used to filter threads according to their titles |
|
| q | nil | Yes | Yes (1 or more) | This string is used to filter threads according to their titles, *REGEX NOT supported* yet |
|
||||||
| chod | 60-99 [94] | No | No | CHanceOfDeath - will include thread in the feed if it's chance to death i > chod |
|
| chod | 60-99 [94] | No | No | CHanceOfDeath - will include thread in the feed if its chance of death is > chod |
|
||||||
| repeat | true, paranoid, [false] | No | No (partly implemented) | Whether to make new notification on every server update even when thread doesnt have higher chod |
|
| repeat | true, paranoid, [false] | No | No (partly implemented) | Whether to make new notification on every server update even when thread doesnt have higher chod |
|
||||||
| recreate | ~bool~ | Not implemented | Not implemented | Whether to notify when creation of new thread matching querry is detected (uses 4chans RSS) |
|
| recreate | ~bool~ | Not implemented | Not implemented | Whether to notify when creation of new thread matching querry is detected (uses 4chans RSS) |
|
||||||
|
|
||||||
|
@ -50,62 +53,54 @@ Standart rules of URLs apply, if you know how to pass params in URL to any websi
|
||||||
- Are in the lowest 98% part of catalog (it's on position ~147/150 e.g. 3 threads before being bumped off)
|
- Are in the lowest 98% part of catalog (it's on position ~147/150 e.g. 3 threads before being bumped off)
|
||||||
- Note that ~//~ are not special characters ~q=/general/~ will work as expected and match thread with "/general/" in it's title
|
- Note that ~//~ are not special characters ~q=/general/~ will work as expected and match thread with "/general/" in it's title
|
||||||
- Also note that regex is *NOT* supported for now, so something like ~q=rainbow*~ will only match threads with "rainbow" followed
|
- Also note that regex is *NOT* supported for now, so something like ~q=rainbow*~ will only match threads with "rainbow" followed
|
||||||
immedidatelly by ~*~
|
immediately by ~*~ in their title
|
||||||
in their title
|
|
||||||
|
|
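The URL parameters described above can be combined into a feed URL by hand. The following is a minimal sketch of doing it programmatically. The helper name `feed-url` and the host `https://watcher.example` are hypothetical and not part of this project, and the exact form of the `board` value (here "mlp") depends on how the host has configured its boards.

```clojure
(require '[clojure.string :as str])
(import 'java.net.URLEncoder)

(defn feed-url
  "Hypothetical helper: builds a feed URL from a base host, a board name,
  a ChoD threshold and one or more title queries (one q= pair per query)."
  [base board chod queries]
  (let [enc   (fn [v] (URLEncoder/encode (str v) "UTF-8"))
        pairs (concat [["board" board] ["chod" chod]]
                      (map (fn [q] ["q" q]) queries))]
    (str base "/feed.xml?"
         (str/join "&" (map (fn [[k v]] (str k "=" (enc v))) pairs)))))

;; Slashes are ordinary characters for matching; they only need URL encoding.
(println (feed-url "https://watcher.example" "mlp" 94 ["/general/" "rainbow"]))
```

Because `q` may appear multiple times, each query becomes its own `q=` pair rather than one comma-separated value.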
||||||
*** Generating URL interactively
|
*** Generating URL interactively
|
||||||
|
|
||||||
Coming soon
|
Coming soon (not really)
|
||||||
|
|
||||||
** Limitations
|
** Bugs
|
||||||
|
|
||||||
This is an experimental project. There are several limitations:
|
See [[https://git.treebrary.org/Treebrary.org/rss-thread-watcher/issues?q=&type=all&state=open&labels=1&milestone=0&assignee=0&poster=0][issues]]
|
||||||
- Only supported board is /mlp/ (You can choose your own when self hosting)
|
|
||||||
- Only searched threads are those who are in the 50% closer to death part of the catalog
|
|
||||||
|
|
||||||
*** Bugs
|
|
||||||
|
|
||||||
See [[https://git.treebrary.org/Treebrary.org/rss-thread-watcher/issues][issues]]
|
|
||||||
|
|
||||||
** Feature set
|
** Feature set
|
||||||
|
|
||||||
- Planned/finnished features [23%]
|
- Planned/finnished features [38%]
|
||||||
- [X] [DONE] Super basic features done (feed, query, repeat)
|
- [X] [DONE] Super basic features done (feed, query, repeat)
|
||||||
- [X] Have proper sorting - The most likely to die threads first
|
- [X] Have proper sorting - The most likely to die threads first
|
||||||
- [X] No params request should redirect to url generator or (for now) documentation
|
- [X] No params request should redirect to url generator or (for now) documentation
|
||||||
- [ ] Config file instead of hardcoding config values
|
- [X] Config file instead of hardcoding config values
|
||||||
- [ ] Include time of latest data fetch
|
- [ ] Include time of latest data fetch
|
||||||
- [ ] Make threads have preview images taken from the actuall thread OP
|
- [ ] Make threads have preview images taken from the actuall thread OP
|
||||||
- [ ] Show which query matched the thread you were notified of
|
- [ ] Show which query matched the thread you were notified of
|
||||||
- [ ] Option to include advanced HTML formating of text (different color text for ChoD etc)
|
- [ ] Option to include advanced HTML formating of text (different color text for ChoD etc)
|
||||||
- [ ] Support notification on watched thread re-creation after it died
|
- [ ] Support notification on watched thread re-creation after it died
|
||||||
- [ ] Support notification for thread death
|
- [ ] Support notification for thread death
|
||||||
- [ ] Support multiple boards at once
|
- [X] Support multiple boards at once
|
||||||
- [ ] Support async responses
|
- [ ] Support async responses
|
||||||
- [ ] Graal VM support for native configuration
|
- [ ] Graal VM support for native configuration
|
||||||
|
|
||||||
** Self hosting
|
** Self hosting
|
||||||
|
|
||||||
This is not supported until release 1.0. You can do it if you figure it out (probably not that hard tbh) but there will be much
|
As of the first Beta release, self hosting is supported. Please refer to the [[file:res/ExampleConfig-documented.edn][documented example config]] for information on configuration
|
||||||
more detailed instructions in the future.
|
options.
|
||||||
|
|
||||||
*** Prebuilt
|
*** Prebuilt
|
||||||
|
|
||||||
There will be instructions at some point I promise. Until then you can download binaries from the releases page and run them like
|
Download the newest release from [[https://git.treebrary.org/Treebrary.org/rss-thread-watcher/releases][releases]] and run it like you would any other Java executable. The default port is ~6969~.
|
||||||
you would any other java executable, default port is ~6969~.
|
|
||||||
|
|
||||||
And you need Java for now if that isn't clear.
|
|
||||||
|
|
||||||
~$ java -jar whatEverNameTheReleaseHas.jar~~
|
~$ java -jar whatEverNameTheReleaseHas.jar~~
|
||||||
|
|
||||||
*** From source
|
*** From source
|
||||||
|
Building from source is not officially supported. If you attempt it, please use the source from a release tarball or check out the ~release~ or ~stable~
|
||||||
|
branch. The ~dev~ branch is unstable and untested and may not even build. The ~stable~ branch should always build but may contain a newer version
|
||||||
|
than the latest release.
|
||||||
|
|
||||||
If you know Clojure, then just clone and build with lein. If you don't either RTFM to lein or wait before instructions will be
|
If you know Clojure, then just clone and build with lein. If you don't, either RTFM for lein or wait until instructions are
|
||||||
avaiabile here.
|
available here.
|
||||||
|
|
||||||
*** Configuring
|
*** Configuring
|
||||||
|
|
||||||
Self hosting is not supported at the moment so no configuration for you.
|
All documentation is for now included in the [[file:res/ExampleConfig-documented.edn][documented example config]].
|
||||||
|
|
||||||
*** Contributing
|
*** Contributing
|
||||||
|
|
||||||
|
|
|
@ -1,4 +1,4 @@
|
||||||
(defproject rss-thread-watch "0.1.0-SNAPSHOT"
|
(defproject rss-thread-watch "0.4.0-SNAPSHOT"
|
||||||
:description "RSS based thread watcher"
|
:description "RSS based thread watcher"
|
||||||
:url "http://example.com/FIXME"
|
:url "http://example.com/FIXME"
|
||||||
:license {:name "AGPL-3.0-only"
|
:license {:name "AGPL-3.0-only"
|
||||||
|
@ -7,7 +7,8 @@
|
||||||
[ring/ring-core "1.8.2"]
|
[ring/ring-core "1.8.2"]
|
||||||
[ring/ring-jetty-adapter "1.8.2"]
|
[ring/ring-jetty-adapter "1.8.2"]
|
||||||
[clj-rss "0.4.0"]
|
[clj-rss "0.4.0"]
|
||||||
[org.clojure/data.json "2.4.0"]]
|
[org.clojure/data.json "2.4.0"]
|
||||||
|
[org.clojure/tools.cli "1.1.230"]]
|
||||||
:main ^:skip-aot rss-thread-watch.core
|
:main ^:skip-aot rss-thread-watch.core
|
||||||
:target-path "target/%s"
|
:target-path "target/%s"
|
||||||
:profiles {:uberjar {:aot :all}})
|
:profiles {:uberjar {:aot :all}})
|
||||||
|
|
47
res/ExampleConfig-documented.edn
Normal file
|
@ -0,0 +1,47 @@
|
||||||
|
{:port 6969 ;Port to listen on
|
||||||
|
:default-board "/mlp/" ;Board to be used when no board=x param given
|
||||||
|
;; Message displayed when requested board is not enabled
|
||||||
|
:board-disabled-message "This board is not enabled for feed generation.\n\nYou can contact me here: [contact]"
|
||||||
|
;; :enable-board-listing true ;Whether to show list of enabled boards in /boards UNIMPLEMENTED
|
||||||
|
|
||||||
|
;; This map defines default values for all enabled boards. If you want some board
|
||||||
|
;; to use different values, specify them below in :boards-enabled
|
||||||
|
:boards-defaults {
|
||||||
|
;; After how many seconds get fresh catalog.json from :target
|
||||||
|
:refresh-rate 300
|
||||||
|
;; Page from which to start indexing threads, threads on pages with lower
|
||||||
|
;; numbers will not be detectable by the feed watcher
|
||||||
|
:starting-page 7
|
||||||
|
;; Default ChOD to use if none is specified by the user
|
||||||
|
:default-chod 94
|
||||||
|
;; If you want to do some preprocessing beforehand, you can override
|
||||||
|
;; Target URL for the board, but the response must be the same as the 4chan API would return
|
||||||
|
;; /$board/catalog.json will be appended to this link
|
||||||
|
:target "https://api.4chan.org"
|
||||||
|
;; Commented parts below are still unimplemented
|
||||||
|
;; ------
|
||||||
|
;; Only download catalog when someone requests feed and cache is old
|
||||||
|
;; Saves requests to 4chan, useful for boards that are checked rarely
|
||||||
|
;; Generally the better option, but the first request after :refresh-rate has elapsed may take longer
|
||||||
|
;; Currently the only option
|
||||||
|
:lazy-load true
|
||||||
|
;; Whether to allow regex search thru the threads (&qr= param) UNIMPLEMENTED
|
||||||
|
;; :regex-enable true
|
||||||
|
;; Whether to create cache by downloading whole catalog or every required
|
||||||
|
;; page one by one UNIMPLEMENTED
|
||||||
|
;; :request-type [:catalog] :pages
|
||||||
|
}
|
||||||
|
;; List of all boards that are enabled for feed generation
|
||||||
|
;; Yes, they must all be listed manually for now
|
||||||
|
;; Each such board must have a map of altered config options if applicable,
|
||||||
|
;; otherwise an empty one must be provided
|
||||||
|
:boards-enabled {"/mlp/" {} ;; Empty override map means that defaults are used
|
||||||
|
;; This means that board "/g/" will have :starting-page set to 7 but all
|
||||||
|
;; the other config options are copied from :boards-defaults
|
||||||
|
"/g/" {:starting-page 7}
|
||||||
|
"/po/" {:starting-page 8
|
||||||
|
:refresh-rate 86400} ;1 day
|
||||||
|
"/p/" {:starting-page 8
|
||||||
|
:refresh-rate 1800} ;30 min
|
||||||
|
}
|
||||||
|
}
|
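The relationship between `:boards-defaults` and the per-board maps in `:boards-enabled` can be sketched as follows. This is a simplified illustration that uses plain `merge`. The project itself goes through `u/map-apply-defaults`, which additionally recurses into nested maps and treats explicit `nil` values as missing. The helper name `effective-board-config` is hypothetical.

```clojure
;; Shared defaults, copied from the example config above.
(def boards-defaults
  {:refresh-rate  300
   :starting-page 7
   :default-chod  94
   :target        "https://api.4chan.org"
   :lazy-load     true})

;; Two of the enabled boards with their override maps.
(def boards-enabled
  {"/mlp/" {}
   "/po/"  {:starting-page 8
            :refresh-rate  86400}})

(defn effective-board-config
  "Hypothetical helper: per-board overrides win over the shared defaults."
  [defaults enabled]
  (into {}
        (map (fn [[board overrides]]
               [board (merge defaults overrides)])
             enabled)))

(def resolved (effective-board-config boards-defaults boards-enabled))

;; "/po/" keeps its own refresh rate but inherits everything else.
(println (get resolved "/po/"))
```

An empty override map, as for "/mlp/", therefore yields exactly the defaults.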
|
@ -1,4 +1,4 @@
|
||||||
;; Copyright (C) 2023 Felisp
|
;; Copyright (C) 2024 Felisp
|
||||||
;;
|
;;
|
||||||
;; This program is free software: you can redistribute it and/or modify
|
;; This program is free software: you can redistribute it and/or modify
|
||||||
;; it under the terms of the GNU Affero General Public License as published by
|
;; it under the terms of the GNU Affero General Public License as published by
|
||||||
|
@ -13,50 +13,131 @@
|
||||||
;; along with this program. If not, see <https://www.gnu.org/licenses/>.
|
;; along with this program. If not, see <https://www.gnu.org/licenses/>.
|
||||||
|
|
||||||
(ns rss-thread-watch.core
|
(ns rss-thread-watch.core
|
||||||
(:require [ring.adapter.jetty :as jetty]
|
(:require [clojure.java.io :as io]
|
||||||
|
[clojure.edn :as edn]
|
||||||
|
[clojure.pprint] ;required explicitly because -main calls clojure.pprint/pprint
|
||||||
|
[clojure.tools.cli :refer [parse-opts]]
|
||||||
|
[ring.adapter.jetty :as jetty]
|
||||||
[ring.middleware.params :as rp]
|
[ring.middleware.params :as rp]
|
||||||
[rss-thread-watch.watcher :as watcher]
|
[rss-thread-watch.watcher :as watcher]
|
||||||
[rss-thread-watch.feed-generator :as feed])
|
[rss-thread-watch.feed-generator :as feed]
|
||||||
|
[rss-thread-watch.utils :as u])
|
||||||
(:gen-class))
|
(:gen-class))
|
||||||
|
|
||||||
;; Internal default config
|
(def VERSION "0.4.0")
|
||||||
(def CONFIG
|
|
||||||
"Internal default config"
|
|
||||||
{:target "https://api.4chan.org/mlp/catalog.json" ;Where to download catalog from
|
|
||||||
:starting-page 7 ;only monitor threads from this from this page and up
|
|
||||||
:refresh-delay (* 60 5) ;Redownload catalog every 5 mins
|
|
||||||
:port 6969 ;Listen on 6969
|
|
||||||
})
|
|
||||||
|
|
||||||
|
;; Internal default config
|
||||||
|
(def CONFIG-DEFAULT
|
||||||
|
"Internal default config"
|
||||||
|
{:port 6969
|
||||||
|
:default-board "/mlp/"
|
||||||
|
:enable-board-listing true
|
||||||
|
:board-disabled-message "This board is not enabled for feed generation.\n\nYou can contact me here: [contact] and I may enable it for you"
|
||||||
|
:boards-defaults {:refresh-rate 300
|
||||||
|
:starting-page 7
|
||||||
|
:default-chod 94
|
||||||
|
:target "https://api.4chan.org"
|
||||||
|
:lazy-load true}
|
||||||
|
:boards-enabled {"/mlp/" {:lazy-load false}
|
||||||
|
"/g/" {:starting-page 7}
|
||||||
|
"/po/" {:starting-page 8
|
||||||
|
:refresh-rate 86400}
|
||||||
|
"/p/" {:starting-page 8
|
||||||
|
:refresh-rate 1800}}})
|
||||||
|
|
||||||
|
(def cli-options
|
||||||
|
"Configuration defining program arguments for tools.cli"
|
||||||
|
[["-v" "--version" "Print version and license information"]
|
||||||
|
["-h" "--help" "Prints help"]
|
||||||
|
["-c" "--config CONFIG_FILE" "Specify config file to use for this run"
|
||||||
|
:default "./config.edn"
|
||||||
|
:validate [#(u/file-exists? %) "Specified config file does not exist or is not readable"]]
|
||||||
|
[nil "--print-default-config" "Prints internal default config file to STDOUT and exits"]])
|
||||||
|
|
||||||
|
;; Todo: Think of a way to start repeated download for every catalog efficiently
|
||||||
(defn set-interval
|
(defn set-interval
|
||||||
"Calls function every ms"
|
"Calls function every ms"
|
||||||
|
{:deprecated true}
|
||||||
[callback ms]
|
[callback ms]
|
||||||
(future (while true (do (try
|
(future (while true (do (try
|
||||||
(callback)
|
(callback)
|
||||||
(println "Recached")
|
(println "Recached")
|
||||||
(catch Exception e
|
(catch Exception e
|
||||||
(binding [*out* *err*]
|
(binding [*out* *err*]
|
||||||
(println "Error while updating cache: " e ", retrying in 5 minutes"))))
|
(println "Error while updating cache: " e ", retrying in " (/ ms 1000 60) " minutes"))))
|
||||||
(Thread/sleep ms)))))
|
(Thread/sleep ms)))))
|
||||||
|
|
||||||
|
(defn load-config
|
||||||
|
"Attempts to load config from file [f].
|
||||||
|
Returns loaded config map or nil if failed"
|
||||||
|
[f]
|
||||||
|
(let [fl (io/as-file f)]
|
||||||
|
(when (.exists fl)
|
||||||
|
(with-open [r (io/reader fl)]
|
||||||
|
(edn/read (java.io.PushbackReader. r))))))
|
||||||
|
|
||||||
|
(defn config-fill-board-defaults
|
||||||
|
"Fills every enabled board with default config values"
|
||||||
|
[config]
|
||||||
|
(let [defaults (:boards-defaults config)]
|
||||||
|
(dissoc (update-in config
|
||||||
|
'(:boards-enabled)
|
||||||
|
(fn [mp]
|
||||||
|
(u/fmap (fn [k v]
|
||||||
|
(u/map-apply-defaults v defaults))
|
||||||
|
mp)))
|
||||||
|
:boards-defaults)))
|
||||||
|
|
||||||
|
(defn get-some-config
|
||||||
|
"Attempts to get config somehow,
|
||||||
|
first from [custom-file], if it's nil,
|
||||||
|
then from ./config.edn file.
|
||||||
|
If neither exists, the internal default is used."
|
||||||
|
[custom-file]
|
||||||
|
(config-fill-board-defaults
|
||||||
|
;; TODO: There has to be try/catch for when file is invalid edn
|
||||||
|
;; This is gonna be done when config validation comes in Beta 2
|
||||||
|
(let [file-to-try (u/nil?-else custom-file
|
||||||
|
"./config.edn")]
|
||||||
|
(u/when-else (load-config file-to-try)
|
||||||
|
CONFIG-DEFAULT))))
|
||||||
|
|
||||||
(defn -main
|
(defn -main
|
||||||
"Entry point, starts webserver"
|
"Entry point, starts webserver"
|
||||||
[& args]
|
[& args]
|
||||||
(println "Starting on port: " (:port CONFIG)
|
(let [parsed-args (parse-opts args cli-options)
|
||||||
"\nGonna recache every: " (:refresh-delay CONFIG) "s")
|
options (get parsed-args :options)]
|
||||||
(set-interval (fn []
|
(when-let [err (get parsed-args :errors)]
|
||||||
(println "Starting cache update")
|
(println "Error: " err)
|
||||||
(watcher/update-thread-cache! (:target CONFIG) (:starting-page CONFIG)))
|
(System/exit 1))
|
||||||
(* 1000 (:refresh-delay CONFIG)))
|
(when (get options :version)
|
||||||
(jetty/run-jetty (rp/wrap-params feed/http-handler) {:port (:port CONFIG)
|
(println "RSS Thread Watcher " VERSION " Licensed under AGPL-3.0-only")
|
||||||
:join? true}))
|
(System/exit 0))
|
||||||
|
(when (get options :help)
|
||||||
|
(println "RSS Thread Watcher help:\n" (get parsed-args :summary))
|
||||||
|
(System/exit 0))
|
||||||
|
(when (get options :print-default-config)
|
||||||
|
(println ";;Default internal config file from RSS Thread Watcher " VERSION)
|
||||||
|
(clojure.pprint/pprint CONFIG-DEFAULT)
|
||||||
|
;; In case someone was copying by hand, this might be useful
|
||||||
|
(println ";;END of Default internal config file")
|
||||||
|
(System/exit 0))
|
||||||
|
|
||||||
|
(let [config (get-some-config (:config options))]
|
||||||
|
;; TODO: probably refactor to use separate config.clj file when validation will be added
|
||||||
|
;; Init the few globals we have
|
||||||
|
(reset! watcher/GLOBAL-CONFIG config)
|
||||||
|
(reset! feed/boards-enabled-cache (set (keys (get config :boards-enabled))))
|
||||||
|
(reset! watcher/chod-threads-cache (watcher/generate-chod-cache-structure config))
|
||||||
|
(clojure.pprint/pprint config)
|
||||||
|
(jetty/run-jetty (rp/wrap-params feed/http-handler) {:port (get config :port (:port CONFIG-DEFAULT))
|
||||||
|
:join? true}))))
|
||||||
|
|
||||||
;; Docs: https://github.com/ring-clojure/ring/wiki/Getting-Started
|
;; Docs: https://github.com/ring-clojure/ring/wiki/Getting-Started
|
||||||
(defn repl-main
|
(defn repl-main
|
||||||
"Development entry point"
|
"Development entry point"
|
||||||
[]
|
[]
|
||||||
(jetty/run-jetty (rp/wrap-params #'feed/http-handler)
|
(jetty/run-jetty (rp/wrap-params #'feed/http-handler)
|
||||||
{:port (:port CONFIG)
|
{:port (:port CONFIG-DEFAULT)
|
||||||
;; Dont block REPL thread
|
;; Dont block REPL thread
|
||||||
:join? false}))
|
:join? false}))
|
||||||
;; (repl-main)
|
;; (repl-main)
|
||||||
|
|
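The TODO in `get-some-config` notes that invalid EDN is not yet handled. A minimal sketch of one way to guard the parse is shown below. It works on a string rather than a file to stay self-contained, and `read-config-string` is a hypothetical name, not something this PR defines.

```clojure
(require '[clojure.edn :as edn])

(defn read-config-string
  "Hypothetical helper: parses EDN text and returns the resulting map.
  Returns nil when the text is not valid EDN or does not contain a map."
  [text]
  (try
    (let [parsed (edn/read-string text)]
      (when (map? parsed)
        parsed))
    (catch Exception _
      nil)))

(println (read-config-string "{:port 6969}"))  ; well-formed config
(println (read-config-string "{:port 6969"))   ; unbalanced brace
```

Returning `nil` on failure matches the contract of `load-config`, so a caller could fall back to the internal default in the same way it does for a missing file.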
|
@ -1,4 +1,4 @@
|
||||||
;; Copyright (C) 2023 Felisp
|
;; Copyright (C) 2024 Felisp
|
||||||
;;
|
;;
|
||||||
;; This program is free software: you can redistribute it and/or modify
|
;; This program is free software: you can redistribute it and/or modify
|
||||||
;; it under the terms of the GNU Affero General Public License as published by
|
;; it under the terms of the GNU Affero General Public License as published by
|
||||||
|
@ -18,15 +18,12 @@
|
||||||
[ring.util.response :as response]
|
[ring.util.response :as response]
|
||||||
[clj-rss.core :as rss]
|
[clj-rss.core :as rss]
|
||||||
[clojure.string :as s]
|
[clojure.string :as s]
|
||||||
[rss-thread-watch.watcher :as watcher])
|
[rss-thread-watch.watcher :as watcher]
|
||||||
|
[rss-thread-watch.utils :as ut])
|
||||||
(:gen-class))
|
(:gen-class))
|
||||||
|
|
||||||
|
(def boards-enabled-cache
|
||||||
(defn indices
|
(atom nil))
|
||||||
;; https://stackoverflow.com/questions/8641305/find-index-of-an-element-matching-a-predicate-in-clojure
|
|
||||||
"Returns indexes of elements passing predicate"
|
|
||||||
[pred coll]
|
|
||||||
(keep-indexed #(when (pred %2) %1) coll))
|
|
||||||
|
|
||||||
(defn new-guid-always
|
(defn new-guid-always
|
||||||
"Generates always unique GUID for Feed item.
|
"Generates always unique GUID for Feed item.
|
||||||
|
@ -51,12 +48,14 @@
|
||||||
(defn filter-chod-posts
|
(defn filter-chod-posts
|
||||||
"Return list of all threads with equal or higher ChoD than requested
|
"Return list of all threads with equal or higher ChoD than requested
|
||||||
|
|
||||||
READS FROM GLOBALS: watcher.time-of-cache" ;Todo: best thing would be to add timestamp to cache
|
[board-cache] is a map with :time (when the cache was generated) and :data (vector of cached threads)"
|
||||||
[query-vec chod-treshold repeat? cache]
|
[query-vec chod-treshold repeat? board-cache]
|
||||||
(let [time-of-generation @watcher/time-of-cache
|
|
||||||
|
(let [{time-of-generation :time
|
||||||
|
cache :data} board-cache
|
||||||
guid-fn (if repeat? (fn [x] (new-guid-always x time-of-generation))
|
guid-fn (if repeat? (fn [x] (new-guid-always x time-of-generation))
|
||||||
update-only-guid)
|
update-only-guid)
|
||||||
cache-start-index (first (indices (fn [x] (>= (:chod x) chod-treshold))
|
cache-start-index (first (ut/indices (fn [x] (>= (:chod x) chod-treshold))
|
||||||
cache))
|
cache))
|
||||||
;; So we don't have to search thru everything we have cached
|
;; So we don't have to search thru everything we have cached
|
||||||
needed-cache-part (subvec cache cache-start-index)
|
needed-cache-part (subvec cache cache-start-index)
|
||||||
|
@ -66,7 +65,7 @@
|
||||||
;; Would be so much easier for user to figure out why is it showing
|
;; Would be so much easier for user to figure out why is it showing
|
||||||
;; and it would solve the problem of super long titles (or OPs instead of titles)
|
;; and it would solve the problem of super long titles (or OPs instead of titles)
|
||||||
(when (some (fn [querry]
|
(when (some (fn [querry]
|
||||||
(s/includes? title querry))
|
(s/includes? (s/lower-case title) (s/lower-case querry)))
|
||||||
query-vec)
|
query-vec)
|
||||||
t)))
|
t)))
|
||||||
(reverse needed-cache-part))]
|
(reverse needed-cache-part))]
|
||||||
|
@ -76,9 +75,9 @@
|
||||||
(defn thread-to-rss-item
|
(defn thread-to-rss-item
|
||||||
"If I wasnt retarded I could have made the cached version look like
|
"If I wasnt retarded I could have made the cached version look like
|
||||||
rss item already but what can you do. I'll refactor I promise, I just need this done ASAP" ;Todo: do what the docstring says
|
rss item already but what can you do. I'll refactor I promise, I just need this done ASAP" ;Todo: do what the docstring says
|
||||||
[t]
|
[t] ;TODO: oh Luna the hardcodes ;;RESUME
|
||||||
(let [link-url (str "https://boards.4chan.org/mlp/thread/" (:no t))] ; jesus, well I said only /mlp/ is supported now so fuck it
|
(let [link-url (str "https://boards.4chan.org/mlp/thread/" (:no t))] ; jesus, well I said only /mlp/ is supported now so fuck it
|
||||||
{:title (format "%.2f%% - %s" (:chod t) (:title t))
|
{:title (format "%.2f%% - %s" (:chod t) (:title t)) ;TODO: Generate link from the target somehow, or just include it from API response
|
||||||
;; :url link-url <- this is supposed to be for images according to: https://cyber.harvard.edu/rss/rss.html
|
;; :url link-url <- this is supposed to be for images according to: https://cyber.harvard.edu/rss/rss.html
|
||||||
:description (format "The thread: '%s' has %.2f%% chance of dying" (:title t) (:chod t))
|
:description (format "The thread: '%s' has %.2f%% chance of dying" (:title t) (:chod t))
|
||||||
:link link-url
|
:link link-url
|
||||||
|
@ -88,7 +87,7 @@
|
||||||
"Generates feed from matching items"
|
"Generates feed from matching items"
|
||||||
[query-vec chod-treshold repeat? cache]
|
[query-vec chod-treshold repeat? cache]
|
||||||
(let [items (filter-chod-posts query-vec chod-treshold repeat? cache)
|
(let [items (filter-chod-posts query-vec chod-treshold repeat? cache)
|
||||||
head {:title "RSS Thread watcher v0.1"
|
head {:title "RSS Thread watcher v0.4" ;TODO: hardcoded string here, remake to reference to config.clj
|
||||||
:link "https://tools.treebrary.org/thread-watcher/feed.xml"
|
:link "https://tools.treebrary.org/thread-watcher/feed.xml"
|
||||||
:feed-url "https://tools.treebrary.org/thread-watcher/feed.xml"
|
:feed-url "https://tools.treebrary.org/thread-watcher/feed.xml"
|
||||||
:description "RSS based thread watcher"}
|
:description "RSS based thread watcher"}
|
||||||
|
@ -100,9 +99,11 @@
|
||||||
|
|
||||||
READS FROM GLOBALS:
|
READS FROM GLOBALS:
|
||||||
rss-thread-watch.watcher.chod-threads-cache
|
rss-thread-watch.watcher.chod-threads-cache
|
||||||
rss-thread-watch.core.CONFIG"
|
rss-thread-watch.watcher.GLOBAL-CONFIG" ;TODO: Update if it really reads from there anymore
|
||||||
[rqst]
|
[rqst]
|
||||||
(try (let [{{chod "chod" :or {chod "94"}
|
(try (let [{{chod "chod"
|
||||||
|
board "board" :or {chod "94"
|
||||||
|
board (get @watcher/GLOBAL-CONFIG :default-board)}
|
||||||
:as prms} :params
|
:as prms} :params
|
||||||
uri :uri} rqst
|
uri :uri} rqst
|
||||||
qrs (prms "q")
|
qrs (prms "q")
|
||||||
|
@ -113,19 +114,23 @@
|
||||||
chod)]
|
chod)]
|
||||||
(try ;If we can't parse number from chod, use default 94
|
(try ;If we can't parse number from chod, use default 94
|
||||||
(if (or (vector? chod)
|
(if (or (vector? chod)
|
||||||
(<= (Integer/parseInt chod) 60)) ; Never accept chod lower that 60 TODO: don't hardcode this
|
(<= (Integer/parseInt chod) 60)) ; Never accept chod lower than 60 TODO: don't hardcode this
|
||||||
60 (Integer/parseInt chod))
|
60 (Integer/parseInt chod))
|
||||||
(catch Exception e
|
(catch Exception e
|
||||||
94)))
|
94)))
|
||||||
cache @watcher/chod-threads-cache]
|
cache @watcher/chod-threads-cache]
|
||||||
;; (println "RCVD: " rqst)
|
(println "\n\nRCVD: " rqst)
|
||||||
(println rqst)
|
;; (println rqst)
|
||||||
;; ====== Errors =====
|
;; ====== Errors =====
|
||||||
;; Something other than feed.xml requested
|
;; Something other than feed.xml requested
|
||||||
(when-not (s/ends-with? uri "feed.xml")
|
(when-not (s/ends-with? uri "feed.xml")
|
||||||
(throw (ex-info "404" {:status 404
|
(throw (ex-info "404" {:status 404
|
||||||
:headers {"Content-Type" "text/plain"}
|
:headers {"Content-Type" "text/plain"}
|
||||||
:body "404 This server has nothing but /feed.xml"})))
|
:body "404 This server has nothing but /feed.xml"})))
|
||||||
|
(when-not (contains? @boards-enabled-cache board)
|
||||||
|
(throw (ex-info "403" {:status 403
|
||||||
|
:headers {"Content-Type" "text/plain"}
|
||||||
|
:body (get @watcher/GLOBAL-CONFIG :board-disabled-message)})))
|
||||||
;; No url params -> we redirect to documentation about params
|
;; No url params -> we redirect to documentation about params
|
||||||
(when (empty? prms)
|
(when (empty? prms)
|
||||||
(throw (ex-info "302"
|
(throw (ex-info "302"
|
||||||
|
@ -149,13 +154,15 @@
|
||||||
;; There shouldn't be any problems with this mime type but if there are
|
;; There shouldn't be any problems with this mime type but if there are
|
||||||
;; replace with "text/xml", or even better, get RSS reader that is not utter shit
|
;; replace with "text/xml", or even better, get RSS reader that is not utter shit
|
||||||
:headers {"Content-Type" "application/rss+xml"}
|
:headers {"Content-Type" "application/rss+xml"}
|
||||||
:body (generate-feed queries real-chod repeat? cache)})
|
:body (generate-feed queries real-chod repeat? (watcher/get-thread-data board @watcher/GLOBAL-CONFIG))})
|
||||||
(catch Exception e
|
(catch Exception e
|
||||||
;; Ex-info has been crafted to match HTTP response body so we can send it
|
;; Ex-info has been crafted to match HTTP response body so we can send it
|
||||||
(if-let [caught (ex-data e)]
|
(if-let [caught (ex-data e)]
|
||||||
caught ;We have custom crafted error
|
caught ;We have custom crafted error
|
||||||
|
(do
|
||||||
|
(print "WTF??: " e)
|
||||||
{:status 500 ;Something else fucked up, we print what happened
|
{:status 500 ;Something else fucked up, we print what happened
|
||||||
:headers {"Content-Type" "text/plain"}
|
:headers {"Content-Type" "text/plain"}
|
||||||
:body (str "500 - Something fucked up while generating feed, If you decide to report it, please include url adress you used:\n"
|
:body (str "500 - Something fucked up while generating feed, If you decide to report it, please include url adress you used:\n"
|
||||||
(ex-cause e) "\n"
|
(ex-cause e) "\n"
|
||||||
e)}))))
|
e)})))))
|
||||||
|
|
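One behavioural change in this file is that title matching became case-insensitive. The sketch below isolates that check so it can be tried on its own. `title-matches?` is a hypothetical name, and the real code performs the same comparison inline inside `filter-chod-posts`.

```clojure
(require '[clojure.string :as s])

(defn title-matches?
  "Hypothetical helper: true when any query occurs in the title, ignoring case.
  Queries are plain substrings, not regular expressions."
  [title queries]
  (let [lowered (s/lower-case title)]
    (boolean (some (fn [q] (s/includes? lowered (s/lower-case q)))
                   queries))))

(println (title-matches? "Rainbow Dash /General/" ["/general/"]))
(println (title-matches? "Rainbow Dash" ["rainbow*"]))
```

The second call does not match because `*` is treated literally, which is the behaviour the README describes for queries such as `q=rainbow*`.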
101
src/rss_thread_watch/utils.clj
Normal file
|
@ -0,0 +1,101 @@
|
||||||
|
;; Copyright (C) 2024 Felisp
|
||||||
|
;;
|
||||||
|
;; This program is free software: you can redistribute it and/or modify
|
||||||
|
;; it under the terms of the GNU Affero General Public License as published by
|
||||||
|
;; the Free Software Foundation, version 3 of the License.
|
||||||
|
;;
|
||||||
|
;; This program is distributed in the hope that it will be useful,
|
||||||
|
;; but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||||
|
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
||||||
|
;; GNU Affero General Public License for more details.
|
||||||
|
;;
|
||||||
|
;; You should have received a copy of the GNU Affero General Public License
|
||||||
|
;; along with this program. If not, see <https://www.gnu.org/licenses/>.
|
||||||
|
|
||||||
|
(ns rss-thread-watch.utils
|
||||||
|
"Util functions"
|
||||||
|
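(:require [clojure.java.io] ;explicit requires for the fully qualified calls below
|
||||||
|
[clojure.string])
|
||||||
|
(:gen-class))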
|
||||||
|
|
||||||
|
;; ===== Macros =====
|
||||||
|
(defmacro nil?-else
|
||||||
|
"Return x unless it's nil, then return y"
|
||||||
|
[x y]
|
||||||
|
`(let [result# ~x]
|
||||||
|
(if (nil? result#)
|
||||||
|
~y
|
||||||
|
result#)))
|
||||||
|
|
||||||
|
(defmacro when-else
|
||||||
|
"Evaluates [tst], if it's truthy value returns that value.
|
||||||
|
If it's not, execute everything in [else] and return last expr."
|
||||||
|
[tst & else]
|
||||||
|
`(let [res# ~tst]
|
||||||
|
(if res#
|
||||||
|
res#
|
||||||
|
(do ~@else))))
|
||||||
|
|
||||||
|
(defmacro ret=
|
||||||
|
"compares two values using [=]. If the result is true
|
||||||
|
returns the value, else the result of [=].
|
||||||
|
|
||||||
|
Useful with if-else"
|
||||||
|
[x y]
|
||||||
|
`(let [x# ~x
|
||||||
|
y# ~y
|
||||||
|
result# (= x# y#)]
|
||||||
|
(if result#
|
||||||
|
x#
|
||||||
|
result#)))
|
||||||
|
|
||||||
|
;; ===== Generic functions ====
|
||||||
|
|
||||||
|
(defn indices
|
||||||
|
;; https://stackoverflow.com/questions/8641305/find-index-of-an-element-matching-a-predicate-in-clojure
|
||||||
|
"Returns indexes of elements passing predicate"
|
||||||
|
[pred coll]
|
||||||
|
(keep-indexed #(when (pred %2) %1) coll))
|
||||||
|
|
||||||
|
(defn map-apply-defaults
|
||||||
|
"Apply default values from [defaults] to keys not present in [conf]
|
||||||
|
Order is very important.
|
||||||
|
Thus all missing values from config are replaced by defaults"
|
||||||
|
[conf defaults]
|
||||||
|
(into conf
|
||||||
|
(for [k (keys defaults)]
|
||||||
|
(let [conf-val (get conf k)
|
||||||
|
default-val (get defaults k)]
|
||||||
|
(if (and (map? conf-val) ; both are maps, so we have to go a level deeper
|
||||||
|
(map? default-val)) ; if only one is, we don't care because then it's a plain assignment
|
||||||
|
{k (map-apply-defaults conf-val default-val)}
|
||||||
|
{k (nil?-else conf-val default-val)})))))
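;; A small sketch of the nested behaviour described in the docstring. The
;; keys and values are hypothetical and not taken from the real config file.
;; Values present in conf win, missing ones come from defaults, at any depth.
(comment
  (map-apply-defaults {:port 8080 :boards {:refresh 30}}
                      {:port 80 :host "localhost" :boards {:refresh 300 :lazy true}})
  ;; => {:port 8080, :host "localhost", :boards {:refresh 30, :lazy true}}
  )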
|
||||||
|
|
||||||
|
(defn fmap
|
||||||
|
"Replaces every value in map [m] with the result of calling [f] on its key and value
|
||||||
|
Function signature should be (fn [key value] ...)."
|
||||||
|
[f m]
|
||||||
|
(into
|
||||||
|
(empty m)
|
||||||
|
(for [[key val] m]
|
||||||
|
[key (f key val)])))
|
||||||
|
|
||||||
|
(defn expand-home
|
||||||
|
"Expands ~ to home directory"
|
||||||
|
;;modified from sauce: https://stackoverflow.com/questions/29585928/how-to-substitute-path-to-home-for
|
||||||
|
[s]
|
||||||
|
(if (clojure.string/starts-with? s "~")
|
||||||
|
(clojure.string/replace-first s "~" (System/getProperty "user.home"))
|
||||||
|
s))
|
||||||
|
|
||||||
|
(defn expand-path
"Expands leading ./ to the current working directory, otherwise expands ~ via expand-home"
|
||||||
|
[s]
|
||||||
|
(if (clojure.string/starts-with? s "./")
|
||||||
|
(clojure.string/replace-first s "." (System/getProperty "user.dir"))
|
||||||
|
(expand-home s)))
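;; A short usage sketch; the paths are made up and the results depend on the
;; user.home and user.dir system properties of the running JVM.
(comment
  (expand-path "~/rss-watcher/config.edn") ; leading ~ replaced by user.home
  (expand-path "./config.edn")             ; leading . replaced by user.dir
  (expand-path "/etc/config.edn"))         ; => "/etc/config.edn", returned unchanged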
|
||||||
|
|
||||||
|
(defn file-exists?
|
||||||
|
"Returns true if file exists"
|
||||||
|
[file]
|
||||||
|
(let [path (if (vector? file)
|
||||||
|
(first file)
|
||||||
|
file)]
|
||||||
|
(.exists (clojure.java.io/file (expand-path path)))))
|
|
@ -1,4 +1,4 @@
|
||||||
;; Copyright (C) 2023 Felisp
|
;; Copyright (C) 2024 Felisp
|
||||||
;;
|
;;
|
||||||
;; This program is free software: you can redistribute it and/or modify
|
;; This program is free software: you can redistribute it and/or modify
|
||||||
;; it under the terms of the GNU Affero General Public License as published by
|
;; it under the terms of the GNU Affero General Public License as published by
|
||||||
|
@ -18,11 +18,23 @@
|
||||||
[clojure.data.json :as js])
|
[clojure.data.json :as js])
|
||||||
(:gen-class))
|
(:gen-class))
|
||||||
|
|
||||||
(def chod-threads-cache
|
(def GLOBAL-CONFIG
|
||||||
"Cached vector of threads that have CHanceOfDeath > configured"
|
"Global config with defaults for missing entries"
|
||||||
(atom []))
|
;; I know globals are ew in Clojure but I don't know any
|
||||||
|
;; better way of doing this
|
||||||
|
(atom nil))
|
||||||
|
|
||||||
(def time-of-cache (atom 0))
|
(def chod-threads-cache
|
||||||
|
"Cached map of threads that have CHanceOfDeath > configured"
|
||||||
|
(atom {}))
|
||||||
|
|
||||||
|
(defn generate-chod-cache-structure
|
||||||
|
"Generates initial structure for global cache
|
||||||
|
Structure is returned, you have to set it yourself"
|
||||||
|
[config]
|
||||||
|
(let [ks (keys (:boards-enabled config))]
|
||||||
|
(zipmap ks
|
||||||
|
(repeatedly (count ks) #(atom nil)))))
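;; A minimal sketch of how the returned structure might be installed and what
;; it looks like. example-config is hypothetical (the "/mlp/" and "/g/" keys
;; are assumptions about the config format); every board gets its own atom,
;; which stays nil until that board is refreshed for the first time.
(comment
  (def example-config {:boards-enabled {"/mlp/" {} "/g/" {}}})
  (reset! chod-threads-cache (generate-chod-cache-structure example-config))
  (set (keys @chod-threads-cache))     ; => #{"/mlp/" "/g/"}
  @(get @chod-threads-cache "/mlp/"))  ; => nil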
|
||||||
|
|
||||||
(defn process-page
|
(defn process-page
|
||||||
"Processes every thread in page, leaving only relevant information
|
"Processes every thread in page, leaving only relevant information
|
||||||
|
@ -44,27 +56,64 @@
|
||||||
(defn build-cache
|
(defn build-cache
|
||||||
"Build cache of near-death threads so the values don't have to be recalculated on each request."
|
"Build cache of near-death threads so the values don't have to be recalculated on each request."
|
||||||
[pages-to-index pages-total threads-per-page threads-total]
|
[pages-to-index pages-total threads-per-page threads-total]
|
||||||
(vec (flatten (map (fn [single-page]
|
{:time (System/currentTimeMillis)
|
||||||
|
:data (vec (flatten (map (fn [single-page]
|
||||||
;; We have to (dec page-number) bcs otherwise we would get the total number of threads
|
;; We have to (dec page-number) bcs otherwise we would get the total number of threads
|
||||||
;; including the whole page of threads
|
;; including the whole page of threads
|
||||||
(let [page-number (dec (:page single-page))] ; inc to get to the actual page
|
(let [page-number (dec (:page single-page))] ; inc to get to the actual page
|
||||||
(process-page (:threads single-page) threads-total (inc (* page-number threads-per-page)))))
|
(process-page (:threads single-page) threads-total (inc (* page-number threads-per-page)))))
|
||||||
pages-to-index))))
|
pages-to-index)))})
|
||||||
|
|
||||||
(defn update-thread-cache!
|
(defn update-board-cache!
|
||||||
"Updates cache of near-death threads. Writes to chod-threads-cache as side effect.
|
"Updates cache of near-death threads. Writes to chod-threads-cache as side effect.
|
||||||
[url] - Url to download data from
|
[url] - Url to download data from
|
||||||
[starting-page] - From which page consider threads to be fit for near-death cache"
|
[board] - Board to assign cached data to, its existence is NOT checked here
|
||||||
[url starting-page]
|
[starting-page] - From which page consider threads to be fit for near-death cache
|
||||||
;; Todo: surround with try so we can timeout and other stuff
|
THIS FUNCTION WRITES TO chod-threads-cache
|
||||||
|
Returns the refreshed cache entry of [board], a map with :time and :data"
|
||||||
|
[url board starting-page]
|
||||||
|
;; Todo: surround with try so we can timeout, 40x and other stuff
|
||||||
(let [catalog (with-open [readr (io/reader url)]
|
(let [catalog (with-open [readr (io/reader url)]
|
||||||
(js/read readr :key-fn keyword))
|
(js/read readr :key-fn keyword))
|
||||||
pages-total (count catalog)
|
pages-total (count catalog)
|
||||||
;; universal calculation for total number of threads:
|
;; universal calculation for total number of threads:
|
||||||
;; (pages-total -1) * threadsPerPage + threadsOnLastpage ;;accounts for boards which have stickied threads making them have 11pages
|
;; (pages-total -1) * threadsPerPage + threadsOnLastpage ;;accounts for boards which have stickied threads making them have 11pages
|
||||||
threads-per-page (count (:threads (first catalog)))
|
threads-per-page (count (:threads (first catalog))) ;; TODO: last could be remade to peek if it's a vector
|
||||||
threads-total (+ (* threads-per-page (dec pages-total)) (count (:threads (last catalog)))) ;; Todo: Yeah, maybe this calculation could be refactored into let
|
threads-total (+ (* threads-per-page (dec pages-total)) (count (:threads (last catalog)))) ;; Todo: Yeah, maybe this calculation could be refactored into let
|
||||||
to-index (filter (fn [item]
|
to-index (filter (fn [item]
|
||||||
(<= starting-page (:page item))) catalog)]
|
(<= starting-page (:page item))) catalog)]
|
||||||
(reset! chod-threads-cache (build-cache to-index pages-total threads-per-page threads-total))
|
;; TODO: there absolutely must be try catch for missing - not enabled boards,
|
||||||
(reset! time-of-cache (System/currentTimeMillis))))
|
;; This is probably resolved now, but keeping it just in case
|
||||||
|
;; This will return nil and that will fuck everything up
|
||||||
|
(println "Refreshed cache for " board)
|
||||||
|
(reset! (get @chod-threads-cache board)
|
||||||
|
(build-cache to-index pages-total threads-per-page threads-total))))
|
||||||
|
|
||||||
|
(defn board-enabled?
|
||||||
|
"Checks whether board is enabled in config"
|
||||||
|
[board config]
|
||||||
|
(contains? (get config :boards-enabled) board))
|
||||||
|
|
||||||
|
(defn get-board-url
|
||||||
|
"Builds the catalog.json url for [board] from its :target in config"
|
||||||
|
[board config]
|
||||||
|
;; TODO: jesus, this needs sanitization and should be probably crafted by some URL class
|
||||||
|
(str (get-in config [:boards-enabled board :target]) board "catalog.json"))
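;; A quick sketch of what the plain string concatenation above produces. The
;; config below is hypothetical (example.org and the "/mlp/" key format are
;; assumptions, not taken from a real config); it shows that the slashes have
;; to come from the board key or from :target, because nothing is added here.
(comment
  (get-board-url "/mlp/" {:boards-enabled {"/mlp/" {:target "https://example.org"}}})
  ;; => "https://example.org/mlp/catalog.json"
  (get-board-url "mlp" {:boards-enabled {"mlp" {:target "https://example.org/"}}})
  ;; => "https://example.org/mlpcatalog.json"
  )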
|
||||||
|
|
||||||
|
(defn get-thread-data
|
||||||
|
"Gets thread cache for given board.
|
||||||
|
If board is lazy loaded, downloads new one if needed.
|
||||||
|
|
||||||
|
MAY CAUSE WRITE TO chod-threads-cache IF NECESSARY"
|
||||||
|
[board config]
|
||||||
|
(let [refresh-rate (* 1000 (get-in config [:boards-enabled board :refresh-rate]))
|
||||||
|
{data :data
|
||||||
|
time-downloaded :time
|
||||||
|
:or {time-downloaded 0}
|
||||||
|
:as board-atom } @(get @chod-threads-cache board)
|
||||||
|
;; TODO: This also makes it implicitly lazy-load -> if disabled make the check here
|
||||||
|
time-to-update? (or (nil? board-atom)
|
||||||
|
(> (System/currentTimeMillis) (+ refresh-rate time-downloaded)))]
|
||||||
|
(if time-to-update?
|
||||||
|
(update-board-cache! (get-board-url board config) board (get-in config [:boards-enabled board :starting-page]))
|
||||||
|
@(get @chod-threads-cache board))))
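;; The freshness check above boils down to the following sketch. It assumes
;; :refresh-rate is configured in seconds (suggested by the (* 1000 ...)
;; conversion); the numbers are made up for illustration.
(comment
  (let [refresh-rate    (* 1000 300)  ; 300 s in milliseconds
        time-downloaded 0]            ; the :or default for a board with no cache yet
    (> (System/currentTimeMillis) (+ refresh-rate time-downloaded)))
  ;; => true, so a board that was never downloaded is always refreshed
  )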
|
||||||
|
|
67
test/rss_thread_watch/utils_test.clj
Normal file
|
@ -0,0 +1,67 @@
|
||||||
|
;; Copyright (C) 2024 Felisp
|
||||||
|
;;
|
||||||
|
;; This program is free software: you can redistribute it and/or modify
|
||||||
|
;; it under the terms of the GNU Affero General Public License as published by
|
||||||
|
;; the Free Software Foundation, version 3 of the License.
|
||||||
|
;;
|
||||||
|
;; This program is distributed in the hope that it will be useful,
|
||||||
|
;; but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||||
|
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
||||||
|
;; GNU Affero General Public License for more details.
|
||||||
|
;;
|
||||||
|
;; You should have received a copy of the GNU Affero General Public License
|
||||||
|
;; along with this program. If not, see <https://www.gnu.org/licenses/>.
|
||||||
|
|
||||||
|
(ns rss-thread-watch.utils-test
|
||||||
|
(:require [clojure.test :refer :all]
|
||||||
|
[rss-thread-watch.utils :refer :all]))
|
||||||
|
|
||||||
|
(def first-map
|
||||||
|
"Example config map with flat and nested keys"
|
||||||
|
{:a :b
|
||||||
|
:c "c"
|
||||||
|
:nested {:fst 1 :scnd {:super :nested}}})
|
||||||
|
|
||||||
|
(def pony-map
|
||||||
|
"Map containing none of the items in map 1"
|
||||||
|
{:best-pony "Twilight Sparkle"})
|
||||||
|
|
||||||
|
(def conflicting-basic-merge (conj pony-map {:a 17 :c 15}))
|
||||||
|
|
||||||
|
(def deep-pony-map {:a "x"
|
||||||
|
:c :something-else
|
||||||
|
:nested {:ponies "everywhere"
|
||||||
|
:fst 69}})
|
||||||
|
|
||||||
|
(def empty-map {})
|
||||||
|
|
||||||
|
(deftest map-apply-defaults-test
|
||||||
|
(testing "Full and no-replace"
|
||||||
|
(is (= first-map (map-apply-defaults first-map empty-map))
|
||||||
|
"No defaults should return conf map unchanged")
|
||||||
|
(is (= first-map (map-apply-defaults empty-map first-map))
|
||||||
|
"Empty map should be completely replaced by defaults"))
|
||||||
|
|
||||||
|
(testing "Basic merge"
|
||||||
|
(is (= (conj pony-map first-map) (map-apply-defaults first-map pony-map))
|
||||||
|
"When all keys unique, maps should be conjd")
|
||||||
|
(is (= (conj first-map pony-map) (map-apply-defaults first-map pony-map))
|
||||||
|
"When all keys unique, maps should be conjd, order matters")
|
||||||
|
(is (= (conj first-map pony-map) (map-apply-defaults pony-map first-map))
|
||||||
|
"When all keys unique, maps should be conjd, more order that matters")
|
||||||
|
(is (= (conj conflicting-basic-merge first-map) (map-apply-defaults first-map conflicting-basic-merge))
|
||||||
|
"Conflicting basic merge, values from conf should win"))
|
||||||
|
;; Most important part, this is the reason we have the function in the first place
|
||||||
|
;; Conj won't merge deep
|
||||||
|
(testing "Nested merge"
|
||||||
|
(is (= {:a :b
|
||||||
|
:c "c"
|
||||||
|
:nested {:ponies "everywhere"
|
||||||
|
:fst 1
|
||||||
|
:scnd {:super :nested}}}
|
||||||
|
(map-apply-defaults first-map deep-pony-map)))))
|
||||||
|
|
||||||
|
(deftest fmap-test
|
||||||
|
(testing "Applying function to values of map"
|
||||||
|
(is (= {:a 2 :b 3} (fmap (fn [k v] (inc v))
|
||||||
|
{:a 1 :b 2})))))
|