Compare commits: 0.4.2Beta1...stable (28 commits)
Author | SHA1 | Date | |
---|---|---|---|
7c1720cd4e | |||
4b7a6e66d8 | |||
38607ee814 | |||
0e3c62fbd1 | |||
e45af756a2 | |||
55ca8f0d47 | |||
64a0f88ac4 | |||
a951e4f470 | |||
1890f14f9e | |||
18cc3e730c | |||
8d61968dc9 | |||
817790cfb4 | |||
5178ab7366 | |||
b88a471a0e | |||
ee3ad0a6e9 | |||
2464a66ac7 | |||
454643675f | |||
3555891000 | |||
389a3fa9ef | |||
1b8600c742 | |||
9a96deccb9 | |||
e871d1a6c4 | |||
20752a3b1c | |||
62f62a967f | |||
4c5ad1e923 | |||
82d920cb3d | |||
373f2f2996 | |||
6c825bcaaa |
9 changed files with 288 additions and 162 deletions
19
README.org
|
@ -12,11 +12,11 @@ Get notifications from your feed reader when your favourite thread is about to d
|
|||
4) Profit! RSS feed will include only the threads matching your query so every notification your feed reader will send means your
|
||||
watched thread is about to die
|
||||
|
||||
*NOTE THAT THIS IS AN ALPHA RELEASE, IF YOUR THREAD DIES BECAUSE OF RSS-WATCHER MALFUNCTION DO NOT BLAME ME* pls
|
||||
*NOTE THAT THIS IS A BETA RELEASE, IF YOUR THREAD DIES BECAUSE OF RSS-WATCHER MALFUNCTION DO NOT BLAME ME* pls
|
||||
|
||||
** Getting custom URL
|
||||
|
||||
A URL without any params (just ~/feed.xml~) won't work. You must specify at least one ~q~. See below.
|
||||
A URL without any params (just ~/feed~) won't work. You must specify at least one ~q~ or ~Q~. See below.
|
||||
|
||||
*** Crafting URL by hand
|
||||
|
||||
|
@ -24,13 +24,14 @@ Right now there is no automated way to generate your feed url but making one by
|
|||
|
||||
**** URL parameters
|
||||
|
||||
Please note that default values may vary depending on which host you use, these are the defaults that come with this software but
|
||||
anyone running instance of RSS thread watcher can change them
|
||||
Please note that default values may vary depending on which instance/host/board you use, these are the defaults that come with
|
||||
this software but anyone running instance of RSS thread watcher can change them
|
||||
|
||||
| Param name | Values [default] | Can have multiple? | Mandatory? | Short description |
|
||||
|------------+-------------------------+--------------------+-------------------------+--------------------------------------------------------------------------------------------------|
|
||||
| board | "mlp" | No | No | Which board to generate feed for, only boards enabled by host will work |
|
||||
| q | nil | Yes | Yes (1 or more) | This string is used to filter threads according to their titles, *REGEX NOT supported* yet |
|
||||
| Q | nil | Yes | No if ~q~ is present | This string is used to filter threads according to their titles, but is CaseSensitive |
|
||||
| chod | 60-99 [94] | No | No | CHanceOfDeath - will include thread in the feed if its chance of death is > chod |
|
||||
| repeat | true, paranoid, [false] | No | No (partly implemented) | Whether to make new notification on every server update even when thread doesn't have higher chod |
|
||||
| recreate | ~bool~ | Not implemented | Not implemented | Whether to notify when creation of new thread matching query is detected (uses 4chan's RSS) |
|
||||
|
@ -40,14 +41,14 @@ anyone running instance of RSS thread watcher can change them
|
|||
Standard URL rules apply; if you know how to pass params in a URL to any website, you don't even have to read this
|
||||
|
||||
- Open some text editor
|
||||
- Paste in default URL: ~https://tools.treebrary.org/thread-watcher/feed.xml?~ (you can use plain HTTP if you want to)
|
||||
- Paste in default URL: ~https://tools.treebrary.org/thread-watcher/feed?~ (you can use plain HTTP if you want to)
|
||||
- Now you can append any of the supported parameters (which you can find in the above table):
|
||||
- For example if we want to be informed about threads with "cute" in their title
|
||||
- ~q=cute~ which would make ~https://tools.treebrary.org/thread-watcher/feed.xml?q=cute~
|
||||
- ~q=cute~ which would make ~https://tools.treebrary.org/thread-watcher/feed?q=cute~
|
||||
- If you want more than one param, separate with ~&~, for example:
|
||||
- ~q=cute~ and ~q=pretty~ would be ~https://tools.treebrary.org/thread-watcher/feed.xml?q=cute&q=pretty~
|
||||
- ~q=cute~ and ~q=pretty~ would be ~https://tools.treebrary.org/thread-watcher/feed?q=cute&q=pretty~
|
||||
- Same is true for when you also want to specify ChoD
|
||||
- ~https://tools.treebrary.org/thread-watcher/feed.xml?q=cute&q=pretty&chod=98~
|
||||
- ~https://tools.treebrary.org/thread-watcher/feed?q=cute&q=pretty&chod=98~
|
||||
- This will only notify you about threads that:
|
||||
- Have ~cute~ or ~pretty~ in their title
|
||||
- Are in the lowest 98% part of the catalog (at position ~147/150, i.e. about 3 threads before being bumped off)
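
The hand-crafting steps above can also be sketched in code. This is only an illustration, not part of the project: the helper name ~feed-url~ is hypothetical, and the base URL is the one quoted in the README (your instance may serve the feed elsewhere).

```clojure
(require '[clojure.string :as str])
(import 'java.net.URLEncoder)

(defn feed-url
  "Hypothetical helper: joins [param value] pairs onto `base` as a query string.
  Values are URL-encoded so search terms with spaces or symbols stay valid."
  [base params]
  (str base "?"
       (str/join "&"
                 (for [[k v] params]
                   (str k "=" (URLEncoder/encode (str v) "UTF-8"))))))

;; Repeating a param (here "q") is how multiple search terms are passed.
(feed-url "https://tools.treebrary.org/thread-watcher/feed"
          [["q" "cute"] ["q" "pretty"] ["chod" 98]])
;; => "https://tools.treebrary.org/thread-watcher/feed?q=cute&q=pretty&chod=98"
```

Pairs are used instead of a map because a map cannot hold the same key (~q~) twice.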
|
||||
|
@ -80,6 +81,8 @@ See [[https://git.treebrary.org/Treebrary.org/rss-thread-watcher/issues?q=&type=
|
|||
- [ ] Support async responses
|
||||
- [ ] Graal VM support for native compilation
|
||||
|
||||
For a more up-to-date and complete list of features, check [[https://git.treebrary.org/Treebrary.org/rss-thread-watcher/projects][open projects]].
|
||||
|
||||
** Self hosting
|
||||
|
||||
As of the first Beta release, self hosting is supported; please refer to the [[file:res/ExampleConfig-documented.edn][documented example config]] for information on configuration
|
||||
|
|
|
@ -1,4 +1,4 @@
|
|||
(defproject rss-thread-watch "0.4.2-SNAPSHOT"
|
||||
(defproject rss-thread-watch "0.4.9-SNAPSHOT"
|
||||
:description "RSS based thread watcher"
|
||||
:url "http://example.com/FIXME"
|
||||
:license {:name "AGPL-3.0-only"
|
||||
|
|
|
@ -3,7 +3,11 @@
|
|||
;; Message displayed when requested board is not enabled
|
||||
:board-disabled-message "This board is not enabled for feed generation.\n\nYou can contact me here: [contact]"
|
||||
;; :enable-board-listing true ;Whether to show list of enabled boards in /boards UNIMPLEMENTED
|
||||
|
||||
;; The watcher feed will be served by this url, everything else will be 404
|
||||
:served-filename "/feed"
|
||||
;; This is homepage for your feed, it should probably redirect somewhere where you mention
|
||||
;; What things you have enabled and where to find full docs
|
||||
:homepage "https://git.treebrary.org/Treebrary.org/rss-thread-watcher"
|
||||
;; This map defines default values for all enabled boards, if you wish for some board
|
||||
;; to use different values, specify them below in :boards-enabled
|
||||
:boards-defaults {
|
||||
|
@ -14,14 +18,12 @@
|
|||
:starting-page 7
|
||||
;; Default ChOD to use if none is specified by the user
|
||||
:default-chod 94
|
||||
;; If you want to do some preprocessing beforehand, you can override
|
||||
;; target URL for the board, but the response must be same the 4chan API would return
|
||||
;; /$board/catalog.json will be appended to this link
|
||||
;; This is target for API requests
|
||||
:target "https://api.4chan.org"
|
||||
;; This host that has the actual threads, /board/thread-no will be appeneded
|
||||
;; to this
|
||||
:host "https://boards.4chan.org"
|
||||
;; This is target for Catalog API requests
|
||||
;; {board} will be substituted with the board name
|
||||
:target "https://api.4chan.org/{board}/catalog.json"
|
||||
;; This is where threads actually reside if different from :target
|
||||
;; you can use {board} and {threadnum} for substitutions
|
||||
:host "https://boards.4chan.org/{board}/thread/{threadnum}"
|
||||
;; Commented parts bellow are still unimplemented
|
||||
;; ------
|
||||
;; Only download catalog when someone requests feed and cache is old
|
||||
|
|
108
src/rss_thread_watch/config.clj
Normal file
|
@ -0,0 +1,108 @@
|
|||
;; Copyright (C) 2024 Felisp
|
||||
;;
|
||||
;; This program is free software: you can redistribute it and/or modify
|
||||
;; it under the terms of the GNU Affero General Public License as published by
|
||||
;; the Free Software Foundation, version 3 of the License.
|
||||
;;
|
||||
;; This program is distributed in the hope that it will be useful,
|
||||
;; but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
||||
;; GNU Affero General Public License for more details.
|
||||
;;
|
||||
;; You should have received a copy of the GNU Affero General Public License
|
||||
;; along with this program. If not, see <https://www.gnu.org/licenses/>.
|
||||
|
||||
(ns rss-thread-watch.config
|
||||
"Functions for working with configuration"
|
||||
(:require [clojure.java.io :as io]
|
||||
[clojure.edn :as edn]
|
||||
[clojure.string :as s]
|
||||
[rss-thread-watch.utils :as u])
|
||||
(:gen-class))
|
||||
|
||||
;; Verification TODO: check if all required keys are included so we don't get nils
|
||||
|
||||
(def VERSION "0.4.9")
|
||||
|
||||
(def GLOBAL-CONFIG
|
||||
"Global config with defaults for missing entries"
|
||||
;; I know globals are ew in Clojure but I don't know any
|
||||
;; better way of doing this
|
||||
(atom nil))
|
||||
|
||||
;; Internal default config
|
||||
(def CONFIG-DEFAULT
|
||||
"Internal default config"
|
||||
{:port 6969
|
||||
:default-board "/mlp/"
|
||||
:enable-board-listing true
|
||||
:served-filename "/feed"
|
||||
:homepage "https://git.treebrary.org/Treebrary.org/rss-thread-watcher"
|
||||
:board-disabled-message "This board is not enabled for feed generation.\n\nYou can contact me here: [contact] and I may enable it for you"
|
||||
:boards-defaults {:refresh-rate 300
|
||||
:starting-page 7
|
||||
:default-chod 94
|
||||
:target "https://api.4chan.org/{board}/catalog.json"
|
||||
:host "https://boards.4chan.org/{board}/thread/{threadnum}"
|
||||
:lazy-load true}
|
||||
:boards-enabled {"/mlp/" {:lazy-load false}
|
||||
"/g/" {:starting-page 7}
|
||||
"/po/" {:starting-page 8
|
||||
:refresh-rate 86400}
|
||||
"/p/" {:starting-page 8
|
||||
:refresh-rate 1800}}})
|
||||
|
||||
(defn load-config
|
||||
"Attempts to load config from file [f].
|
||||
Returns loaded config map or nil if failed"
|
||||
[f]
|
||||
(let [fl (io/as-file f)]
|
||||
(when (.exists fl)
|
||||
(with-open [r (io/reader fl)]
|
||||
(edn/read (java.io.PushbackReader. r))))))
|
||||
|
||||
|
||||
(defn config-url-expand
|
||||
"Expands substitution in :target and :host fields"
|
||||
[filled-config]
|
||||
(let [boards (get filled-config :boards-enabled)
|
||||
selecting '(:target :host)
|
||||
pattern "{board}"]
|
||||
(assoc filled-config
|
||||
:boards-enabled
|
||||
(u/fmap (fn [board confs]
|
||||
(->> (select-keys confs selecting)
|
||||
(u/fmap (fn [_ v]
|
||||
(s/replace v pattern (s/replace board "/" ""))))
|
||||
(merge confs)))
|
||||
boards))))
|
||||
|
||||
(defn config-fill-board-defaults
|
||||
;; TODO: must have check that if board is default, it's enabled, if it's not, give some big fat warning
|
||||
;; that users must always specify board, maybe change the error?
|
||||
"Fills every enabled board with default config values"
|
||||
[config]
|
||||
(let [defaults (:boards-defaults config)]
|
||||
(as-> config conf
|
||||
(update-in conf
|
||||
'(:boards-enabled)
|
||||
(fn [mp]
|
||||
(u/fmap (fn [k v]
|
||||
(assoc (u/map-apply-defaults v defaults) :name k))
|
||||
mp)))
|
||||
(dissoc conf :boards-defaults)
|
||||
(config-url-expand conf))))
|
||||
|
||||
(defn get-some-config
|
||||
"Attempts to get config somehow,
|
||||
first from [custom-file], if it's nil,
|
||||
then from ./config.edn file.
|
||||
If neither exists, the default internal one is used."
|
||||
[custom-file]
|
||||
(config-fill-board-defaults
|
||||
;; TODO: There has to be try/catch for when file is invalid edn
|
||||
;; This is gonna be done when config validation comes in Beta 2
|
||||
(let [file-to-try (u/nil?-else custom-file
|
||||
"./config.edn")]
|
||||
(u/when-else (load-config file-to-try)
|
||||
CONFIG-DEFAULT))))
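
What the config pipeline above does to a single board entry can be sketched in isolation. This is a minimal approximation under stated assumptions: ~expand-board-urls~ is a hypothetical name, and a plain ~merge~ stands in for ~u/map-apply-defaults~, whose full definition is not shown in this diff and may behave differently for nested or nil values.

```clojure
(require '[clojure.string :as str])

(defn expand-board-urls
  "Hypothetical helper: fills one board's config from `defaults`, then
  substitutes {board} in :target and :host. {threadnum} is left untouched
  because it is only known when a feed item is generated."
  [board-name board-conf defaults]
  (let [merged (merge defaults board-conf)          ; board values win over defaults
        board  (str/replace board-name "/" "")]     ; "/po/" -> "po"
    (-> merged
        (update :target str/replace "{board}" board)
        (update :host str/replace "{board}" board)
        (assoc :name board-name))))

(expand-board-urls "/po/"
                   {:starting-page 8 :refresh-rate 86400}
                   {:refresh-rate 300
                    :starting-page 7
                    :target "https://api.4chan.org/{board}/catalog.json"
                    :host "https://boards.4chan.org/{board}/thread/{threadnum}"})
;; :target becomes "https://api.4chan.org/po/catalog.json"
;; :host   becomes "https://boards.4chan.org/po/thread/{threadnum}"
```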
|
|
@ -13,36 +13,15 @@
|
|||
;; along with this program. If not, see <https://www.gnu.org/licenses/>.
|
||||
|
||||
(ns rss-thread-watch.core
|
||||
(:require [clojure.java.io :as io]
|
||||
[clojure.edn :as edn]
|
||||
[clojure.tools.cli :refer [parse-opts]]
|
||||
(:require [clojure.tools.cli :refer [parse-opts]]
|
||||
[ring.adapter.jetty :as jetty]
|
||||
[ring.middleware.params :as rp]
|
||||
[rss-thread-watch.watcher :as watcher]
|
||||
[rss-thread-watch.feed-generator :as feed]
|
||||
[rss-thread-watch.utils :as u])
|
||||
[rss-thread-watch.utils :as u]
|
||||
[rss-thread-watch.config :as conf])
|
||||
(:gen-class))
|
||||
|
||||
(def VERSION "0.4.2")
|
||||
|
||||
;; Internal default config
|
||||
(def CONFIG-DEFAULT
|
||||
"Internal default config"
|
||||
{:port 6969
|
||||
:default-board "/mlp/"
|
||||
:enable-board-listing true
|
||||
:board-disabled-message "This board is not enabled for feed generation.\n\nYou can contact me here: [contact] and I may enable it for you"
|
||||
:boards-defaults {:refresh-rate 300
|
||||
:starting-page 7
|
||||
:default-chod 94
|
||||
:target "https://api.4chan.org"
|
||||
:lazy-load true}
|
||||
:boards-enabled {"/mlp/" {:lazy-load false}
|
||||
"/g/" {:starting-page 7}
|
||||
"/po/" {:starting-page 8
|
||||
:refresh-rate 86400}
|
||||
"/p/" {:starting-page 8
|
||||
:refresh-rate 1800}}})
|
||||
|
||||
(def cli-options
|
||||
"Configuration defining program arguments for cli.tools"
|
||||
|
@ -66,41 +45,6 @@
|
|||
(println "Error while updating cache: " e ", retrying in " (/ ms 1000 60) " minutes"))))
|
||||
(Thread/sleep ms)))))
|
||||
|
||||
(defn load-config
|
||||
"Attempts to load config from file [f].
|
||||
Returns loaded config map or nil if failed"
|
||||
[f]
|
||||
(let [fl (io/as-file f)]
|
||||
(when (.exists fl)
|
||||
(with-open [r (io/reader fl)]
|
||||
(edn/read (java.io.PushbackReader. r))))))
|
||||
|
||||
(defn config-fill-board-defaults
|
||||
"Fills every enabled board with default config values"
|
||||
[config]
|
||||
(let [defaults (:boards-defaults config)]
|
||||
(dissoc (update-in config
|
||||
'(:boards-enabled)
|
||||
(fn [mp]
|
||||
(u/fmap (fn [k v]
|
||||
(assoc (u/map-apply-defaults v defaults) :name k))
|
||||
mp)))
|
||||
:boards-defaults)))
|
||||
|
||||
(defn get-some-config
|
||||
"Attempts to get config somehow,
|
||||
first from [custom-file], if it's nil,
|
||||
then from ./config.edn file.
|
||||
If neither exists, the default internal one is used."
|
||||
[custom-file]
|
||||
(config-fill-board-defaults
|
||||
;; TODO: There has to be try/catch for when file is invalid edn
|
||||
;; This is gonna be done when config validation comes in Beta 2
|
||||
(let [file-to-try (u/nil?-else custom-file
|
||||
"./config.edn")]
|
||||
(u/when-else (load-config file-to-try)
|
||||
CONFIG-DEFAULT))))
|
||||
|
||||
(defn -main
|
||||
"Entry point, starts webserver"
|
||||
[& args]
|
||||
|
@ -110,36 +54,41 @@
|
|||
(println "Error: " err)
|
||||
(System/exit 1))
|
||||
(when (get options :version)
|
||||
(println "RSS Thread Watcher " VERSION " Licensed under AGPL-3.0-only")
|
||||
(println "RSS Thread Watcher " conf/VERSION " Licensed under AGPL-3.0-only")
|
||||
(System/exit 0))
|
||||
(when (get options :help)
|
||||
(println "RSS Thread Watcher help:\n" (get parsed-args :summary))
|
||||
(System/exit 0))
|
||||
(when (get options :print-default-config)
|
||||
(println ";;Default internal config file from RSS Thread Watcher " VERSION)
|
||||
(clojure.pprint/pprint CONFIG-DEFAULT)
|
||||
(println ";;Default internal config file from RSS Thread Watcher " conf/VERSION)
|
||||
(clojure.pprint/pprint conf/CONFIG-DEFAULT)
|
||||
;; In case someone was copying by hand, this might be useful
|
||||
(println ";;END of Default internal config file")
|
||||
(System/exit 0))
|
||||
|
||||
(let [config (get-some-config (:config options))]
|
||||
(let [config (conf/get-some-config (:config options))]
|
||||
;; TODO: probably refactor to use separate config.clj file when validation will be added
|
||||
;; Init the few globals we have
|
||||
(reset! watcher/GLOBAL-CONFIG config)
|
||||
;; TODO: this all needs to go in separate function so it doesnt have to duplicated in repl-main
|
||||
(reset! conf/GLOBAL-CONFIG config)
|
||||
(reset! feed/boards-enabled-cache (set (keys (get config :boards-enabled))))
|
||||
(reset! watcher/chod-threads-cache (watcher/generate-chod-cache-structure config))
|
||||
(clojure.pprint/pprint config)
|
||||
(jetty/run-jetty (rp/wrap-params feed/http-handler) {:port (:port CONFIG-DEFAULT)
|
||||
(jetty/run-jetty (rp/wrap-params feed/http-handler) {:port (:port config)
|
||||
:join? true}))))
|
||||
|
||||
;; Docs: https://github.com/ring-clojure/ring/wiki/Getting-Started
|
||||
(defn repl-main
|
||||
"Development entry point"
|
||||
[]
|
||||
(let [config (conf/get-some-config nil)]
|
||||
;; TODO: probably refactor to use separate config.clj file when validation will be added
|
||||
;; Init the few globals we have
|
||||
(reset! conf/GLOBAL-CONFIG config)
|
||||
(reset! feed/boards-enabled-cache (set (keys (get config :boards-enabled))))
|
||||
(reset! watcher/chod-threads-cache (watcher/generate-chod-cache-structure config)))
|
||||
(jetty/run-jetty (rp/wrap-params #'feed/http-handler)
|
||||
{:port (:port CONFIG-DEFAULT)
|
||||
{:port (:port conf/CONFIG-DEFAULT)
|
||||
;; Dont block REPL thread
|
||||
:join? false}))
|
||||
;; (repl-main)
|
||||
;; Single cache update for repl
|
||||
;; (watcher/update-thread-cache! (:target CONFIG) (:starting-page CONFIG))
|
||||
|
|
|
@ -14,12 +14,13 @@
|
|||
|
||||
(ns rss-thread-watch.feed-generator
|
||||
"Generates feeds for requests"
|
||||
(:require [ring.middleware.params :as rp]
|
||||
[ring.util.response :as response]
|
||||
(:require [ring.util.response :as response]
|
||||
[clj-rss.core :as rss]
|
||||
[clojure.string :as s]
|
||||
[rss-thread-watch.watcher :as watcher]
|
||||
[rss-thread-watch.utils :as ut])
|
||||
[rss-thread-watch.utils :as ut]
|
||||
[rss-thread-watch.config :as conf]
|
||||
[rss-thread-watch.filters :as f])
|
||||
(:gen-class))
|
||||
|
||||
(def boards-enabled-cache
|
||||
|
@ -33,8 +34,8 @@
|
|||
This is done by always making new GUID - (concat thread-number UNIX-time-of-data-update)"
|
||||
[thread time]
|
||||
(assoc thread :guid (str (:no thread)
|
||||
"-"
|
||||
time)))
|
||||
"-"
|
||||
time)))
|
||||
|
||||
(defn new-guid-paranoid
|
||||
"Generate unique GUID on EVERY request to the feed.
|
||||
|
@ -50,16 +51,24 @@
|
|||
|
||||
This is done by concating thread-number and it's rounded chod"
|
||||
[thread]
|
||||
(assoc thread :guid (format "%d-%.2f"
|
||||
(assoc thread :guid (format "%d-%d-%.2f"
|
||||
(:no thread)
|
||||
(:last-modified thread)
|
||||
(:chod thread))))
|
||||
|
||||
(defn make-filters
|
||||
"Creates map of functions and filters from query string.
|
||||
Return format is: {filter-fun ['words' 'to' 'filter' 'using this function]}"
|
||||
[query-string known-filter-map]
|
||||
(let [filterable (select-keys query-string
|
||||
(keys known-filter-map))]
|
||||
(ut/fkmap (fn [k v]
|
||||
{(get known-filter-map k) (ut/vectorize v)})
|
||||
filterable)))
|
||||
|
||||
(defn filter-chod-posts
|
||||
"Return list of all threads with equal or higher ChoD than requested
|
||||
|
||||
READS FROM GLOBALS: watcher.time-of-cache"
|
||||
[query-vec chod-treshold repeat? board-cache]
|
||||
|
||||
"Return list of all threads with equal or higher ChoD than requested"
|
||||
[filters chod-treshold repeat? board-cache]
|
||||
(let [{time-of-generation :time
|
||||
cache :data} board-cache
|
||||
guid-fn (case repeat?
|
||||
|
@ -67,98 +76,104 @@
|
|||
"true" (fn [x] (new-guid-always x time-of-generation))
|
||||
update-only-guid)
|
||||
cache-start-index (first (ut/indices (fn [x] (>= (:chod x) chod-treshold))
|
||||
cache))
|
||||
cache))
|
||||
;; So we don't have to search thru everything we have cached
|
||||
needed-cache-part (subvec cache cache-start-index)
|
||||
actuall-matches (keep (fn [t]
|
||||
(let [title (:title t)]
|
||||
;; Todo: Man, wouldn't it be cool to know which querry matched the thread?
|
||||
;; Would be so much easier for user to figure out why is it showing
|
||||
;; and it would solve the problem of super long titles (or OPs instead of titles)
|
||||
(when (some (fn [querry]
|
||||
(s/includes? (s/lower-case title) (s/lower-case querry)))
|
||||
query-vec)
|
||||
t)))
|
||||
actuall-matches (keep (fn [thread]
|
||||
(some
|
||||
(fn [fun]
|
||||
(when (fun thread (get filters fun))
|
||||
thread))
|
||||
(keys filters)))
|
||||
(reverse needed-cache-part))]
|
||||
;; Finally generate and append GUIDs
|
||||
(map guid-fn actuall-matches)))
|
||||
|
||||
(defn thread-to-rss-item
|
||||
"Converts cached thread item to feed item which can be serialized into RSS"
|
||||
[t host board]
|
||||
(let [link-url (str host board (:no t))]
|
||||
{:title (format "%.2f%% - %s" (:chod t) (:title t)) ;TODO: Generate link from the target somehow, or just include it from API response
|
||||
[t host]
|
||||
(let [link-url (s/replace host "{threadnum}" (str (:no t)))]
|
||||
{:title (format "%.2f%% - %s" (:chod t) (:title t))
|
||||
;; :url link-url <- this is supposed to be for images according to: https://cyber.harvard.edu/rss/rss.html
|
||||
:description (format "The thread: '%s' has %.2f%% chance of dying" (:title t) (:chod t))
|
||||
:link link-url
|
||||
:guid (:guid t)}))
|
||||
|
||||
(defn generate-feed
|
||||
"Generates feed from matching items"
|
||||
[query-vec chod-treshold repeat? cache board-config]
|
||||
(let [items (filter-chod-posts query-vec chod-treshold repeat? cache)
|
||||
head {:title "RSS Thread watcher v0.4.2" ;TODO: hardcoded string here, remake to reference to config.clj
|
||||
:link "https://tools.treebrary.org/thread-watcher/feed.xml"
|
||||
:feed-url "https://tools.treebrary.org/thread-watcher/feed.xml"
|
||||
"Generates feed from matching items
|
||||
|
||||
READS FROM GLOBALS:
|
||||
rss-thread-watch.config/VERSION
|
||||
rss-thread-watch.config/GLOBAL-CONFIG"
|
||||
[filters chod-treshold repeat? cache board-config self-link]
|
||||
(let [items (filter-chod-posts filters chod-treshold repeat? cache)
|
||||
head {:title (str "RSS Thread watcher v" conf/VERSION)
|
||||
;; :link is the homepage of the channel
|
||||
:link (get @conf/GLOBAL-CONFIG :homepage)
|
||||
;; :feed-url is where you can get new items, must match the url this is served at
|
||||
:feed-url self-link
|
||||
:description "RSS based thread watcher"}
|
||||
body (map #(thread-to-rss-item
|
||||
%1
|
||||
(get board-config :host)
|
||||
(get board-config :name)) items)]
|
||||
(get board-config :host)) items)]
|
||||
(rss/channel-xml head body)))
|
||||
|
||||
(defn http-handler
|
||||
"Handles HTTP requests, returns generated feed
|
||||
|
||||
READS FROM GLOBALS:
|
||||
rss-thread-watch.watcher.chod-threads-cache
|
||||
rss-thread-watch.watcher.GLOBAL-CONFIG" ;TODO: Update if it really reads from there anymore
|
||||
rss-thread-watch.watcher/chod-threads-cache
|
||||
rss-thread-watch.config/GLOBAL-CONFIG"
|
||||
[rqst]
|
||||
(try (let [{{chod "chod"
|
||||
(try (let [served-filename (get @conf/GLOBAL-CONFIG :served-filename)
|
||||
{{chod "chod"
|
||||
board "board"
|
||||
repeat? "repeat" :or {chod "94"
|
||||
board (get @watcher/GLOBAL-CONFIG :default-board)
|
||||
board (get @conf/GLOBAL-CONFIG :default-board)
|
||||
repeat? false}
|
||||
:as prms} :params
|
||||
uri :uri} rqst
|
||||
qrs (prms "q")
|
||||
queries (if (vector? qrs) qrs [qrs]) ; to always return vector
|
||||
real-chod (if-let [ch (or (and (vector? chod)
|
||||
(first chod))
|
||||
chod)]
|
||||
(try ;If we can't parse number from chod, use default 94
|
||||
(if (or (vector? chod)
|
||||
(<= (Integer/parseInt chod) 60)) ; Never accept chod lower than 60 TODO: don't hardcode this
|
||||
60 (Integer/parseInt chod))
|
||||
(catch Exception e
|
||||
94)))
|
||||
board-config (get-in @watcher/GLOBAL-CONFIG [:boards-enabled board])
|
||||
uri :uri
|
||||
query :query-string
|
||||
scheme :scheme
|
||||
server-name :server-name} rqst
|
||||
filters (make-filters prms f/known-filters)
|
||||
;; BUG if local fileserver not running -> FileNotFound exception is thrown and it fucks up the feed generation
|
||||
;; Should be handled because wrong config and thus url generation could do the same
|
||||
self-uri (str (s/replace-first scheme ":" "") ;
|
||||
"://" server-name uri "?" query)
|
||||
board-config (get-in @conf/GLOBAL-CONFIG [:boards-enabled board])
|
||||
real-chod (try (max (Integer/parseInt (or (and (vector? chod)
|
||||
(first chod))
|
||||
chod)) 60) ;HARDCODED CHoD
|
||||
(catch Exception _
|
||||
(get board-config :default-chod)))
|
||||
cache @watcher/chod-threads-cache]
|
||||
(println "\n\nRCVD: " rqst)
|
||||
;; (println rqst)
|
||||
;; ====== Errors =====
|
||||
;; Something other than feed.xml requested
|
||||
(when-not (s/ends-with? uri "feed.xml")
|
||||
;; Something other than $served-filename requested
|
||||
(when-not (s/ends-with? uri served-filename)
|
||||
(throw (ex-info "404" {:status 404
|
||||
:header {"Content-Type" "text/plain"}
|
||||
:body "404 This server has nothing but /feed.xml"})))
|
||||
:body (str "404 This server has nothing but " served-filename)})))
|
||||
(when-not (contains? @boards-enabled-cache board)
|
||||
(throw (ex-info "403" {:status 403
|
||||
:header {"Content-Type" "text/plain"}
|
||||
:body (get @watcher/GLOBAL-CONFIG :board-disabled-message)})))
|
||||
:body (get @conf/GLOBAL-CONFIG :board-disabled-message)})))
|
||||
;; No url params -> we redirect to documentation about params
|
||||
(when (empty? prms)
|
||||
(throw (ex-info "302"
|
||||
(response/redirect "https://git.treebrary.org/Treebrary.org/rss-thread-watcher#headline-4"))))
|
||||
(response/redirect (get @conf/GLOBAL-CONFIG :homepage)))))
|
||||
|
||||
;; No querry specified - don't know what to search for
|
||||
(when-not (prms "q")
|
||||
(when-not (some f/known-filter-set (keys prms))
|
||||
(throw (ex-info "400" {:status 400
|
||||
:header {"Content-Type" "text/plain"}
|
||||
:body (str "400 You MUST specify query with one OR more'q=searchTerm' url parameter(s)\n\n\n"
|
||||
"Exmple: '/feed.xml?q=pony&q=IWTCIRD' will show in your feed all threads with 'pony' or 'IWTCIRD'"
|
||||
:body (str "400 You MUST specify query with one OR more'q=searchTerm' (or 'Q=SeARChteRm' for case sensitive) url parameter(s)\n\n\n"
|
||||
"Exmple: '" served-filename "?q=pony&q=IWTCIRD' will show in your feed all threads with 'pony' or 'IWTCIRD'"
|
||||
" in their title that are about to die.")})))
|
||||
;; Whether cache has been generated yet
|
||||
|
||||
(when (empty? cache)
|
||||
(throw (ex-info "503" {:status 503
|
||||
:header {"Content-Type" "text/plain"}
|
||||
|
@ -169,7 +184,7 @@
|
|||
;; There shouldn't be any problems with this mime type but if there are
|
||||
;; replace with "text/xml", or even better, get RSS reader that is not utter shit
|
||||
:header {"Content-Type" "application/rss+xml"}
|
||||
:body (generate-feed queries real-chod repeat? (watcher/get-thread-data board @watcher/GLOBAL-CONFIG) board-config)})
|
||||
:body (generate-feed filters real-chod repeat? (watcher/get-thread-data board @conf/GLOBAL-CONFIG) board-config self-uri)})
|
||||
(catch Exception e
|
||||
;; Ex-info has been crafted to match HTTP response body so we can send it
|
||||
(if-let [caught (ex-data e)]
|
||||
|
|
37
src/rss_thread_watch/filters.clj
Normal file
|
@ -0,0 +1,37 @@
|
|||
;; Copyright (C) 2024 Felisp
|
||||
;;
|
||||
;; This program is free software: you can redistribute it and/or modify
|
||||
;; it under the terms of the GNU Affero General Public License as published by
|
||||
;; the Free Software Foundation, version 3 of the License.
|
||||
;;
|
||||
;; This program is distributed in the hope that it will be useful,
|
||||
;; but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
||||
;; GNU Affero General Public License for more details.
|
||||
;;
|
||||
;; You should have received a copy of the GNU Affero General Public License
|
||||
;; along with this program. If not, see <https://www.gnu.org/licenses/>.
|
||||
|
||||
(ns rss-thread-watch.filters
|
||||
"Functions filtering posts"
|
||||
(:require [clojure.string :as cs])
|
||||
(:gen-class))
|
||||
|
||||
(defn case-sensitive-filter
|
||||
"Returns true if the thread title contains any of [queries]. The match is case sensitive"
|
||||
[{:keys [title]} queries]
|
||||
(some (fn [querry]
|
||||
(cs/includes? title querry))
|
||||
queries))
|
||||
|
||||
(defn case-insensitive-filter
|
||||
"Returns true if the thread title contains any of [queries], ignoring case"
|
||||
[{:keys [title]} queries]
|
||||
(case-sensitive-filter {:title (cs/lower-case title)} (map cs/lower-case queries)))
|
||||
|
||||
(def known-filters
|
||||
{"Q" case-sensitive-filter
|
||||
"q" case-insensitive-filter})
|
||||
|
||||
(def known-filter-set (set (keys known-filters)))
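
The difference between the ~q~ and ~Q~ filters can be shown with a small standalone sketch. The function ~matches-any?~ and the sample title are hypothetical and exist only for illustration; the project's own filters take a thread map rather than a bare title.

```clojure
(require '[clojure.string :as str])

(defn matches-any?
  "Hypothetical helper: true when `title` contains at least one of `queries`.
  With case-sensitive? false, both sides are lower-cased first (like ~q~);
  with true, the strings are compared as given (like ~Q~)."
  [title queries case-sensitive?]
  (let [norm (if case-sensitive? identity str/lower-case)]
    (boolean (some #(str/includes? (norm title) (norm %)) queries))))

(matches-any? "Cute Pony Thread" ["cute"] false)       ;; => true
(matches-any? "Cute Pony Thread" ["cute"] true)        ;; => false
(matches-any? "Cute Pony Thread" ["Cute" "dog"] true)  ;; => true
```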
|
||||
|
|
@ -47,6 +47,11 @@
|
|||
~x
|
||||
result#)))
|
||||
|
||||
(defn vectorize
|
||||
"If arg is not a vector, put into vector, otherwise return it"
|
||||
[v]
|
||||
(if (vector? v) v [v]))
|
||||
|
||||
;; ===== Generic functions ====
|
||||
|
||||
(defn indices
|
||||
|
@ -69,15 +74,28 @@
|
|||
{k (map-apply-defaults conf-val default-val)}
|
||||
{k (nil?-else conf-val default-val)})))))
|
||||
|
||||
;; This is a shitty version of reduce-kv
|
||||
(defn fmap
|
||||
"Applies function [f] to every key and value in map [m]
|
||||
Function signature should be (f [key value])."
|
||||
Function signature should be (f [key value]).
|
||||
Key stays unchanged"
|
||||
[f m]
|
||||
(into
|
||||
(empty m)
|
||||
(for [[key val] m]
|
||||
[key (f key val)])))
|
||||
|
||||
(defn fkmap
|
||||
;; I am horrible with docstrings, I don't deny that
|
||||
"Applies function [f] to every key and value in map [m]
|
||||
Function signature should be (f [key value]).
|
||||
Unlike fmap, you can change key too, so return both {key value} in map"
|
||||
[f m]
|
||||
(into
|
||||
(empty m)
|
||||
(for [[key val] m]
|
||||
(f key val))))
|
||||
|
||||
(defn expand-home
|
||||
"Expands ~ to home directory"
|
||||
;;modified from sauce: https://stackoverflow.com/questions/29585928/how-to-substitute-path-to-home-for
|
||||
|
|
|
@ -18,12 +18,6 @@
|
|||
[clojure.data.json :as js])
|
||||
(:gen-class))
|
||||
|
||||
(def GLOBAL-CONFIG
|
||||
"Global config with defaults for missing entries"
|
||||
;; I know globals are ew in Clojure but I don't know any
|
||||
;; better way of doing this
|
||||
(atom nil))
|
||||
|
||||
(def chod-threads-cache
|
||||
"Cached map of threads that have CHanceOfDeath > configured"
|
||||
(atom {}))
|
||||
|
@ -38,7 +32,7 @@
|
|||
|
||||
(defn process-page
|
||||
"Procesess every thread in page, leaving only relevant information
|
||||
(title no chod)"
|
||||
(:title or :com, :no :chod :last_modified)"
|
||||
([threads-to-index threads-total starting-index] (process-page threads-to-index threads-total starting-index (transient [])))
|
||||
([remaining-threads threads-total index ret]
|
||||
(if (empty? remaining-threads)
|
||||
|
@ -47,17 +41,16 @@
|
|||
(recur (rest remaining-threads)
|
||||
threads-total
|
||||
(inc index)
|
||||
;; We have to somehow include URL which is a problem since the catalog does not contain any
|
||||
;; I of course know how to craft it but the result will be kind of 4chan specific
|
||||
(conj! ret {:title (or (:sub thread) ;We use thread title if thread has it
|
||||
(:com thread) ;we use body if thread has it
|
||||
"") ;Thread has neither, this prevents null pointer
|
||||
:no (:no thread)
|
||||
:chod (* 100 (float (/ index threads-total)))}))))))
|
||||
:chod (* 100 (float (/ index threads-total)))
|
||||
:last-modified (:last_modified thread)}))))))
|
||||
|
||||
(defn build-cache
|
||||
"Build cache of near-death threads so the values don't have to be recalculated on each request."
|
||||
[pages-to-index pages-total threads-per-page threads-total]
|
||||
[pages-to-index threads-per-page threads-total]
|
||||
{:time (System/currentTimeMillis)
|
||||
:data (vec (flatten (map (fn [single-page]
|
||||
;; We have to (dec page-number) bcs otherwise we would get the total number of threads
|
||||
|
@ -71,8 +64,9 @@
|
|||
[url] - Url to download data from
|
||||
[board] - Board to assign cached data to, it's existence is NOT checked here
|
||||
[starting-page] - From which page consider threads to be fit for near-death cache
|
||||
THIS FUNCTION WRITES TO chod-threads-cache
|
||||
Returns :data part of [board] cache"
|
||||
Returns :data part of [board] cache
|
||||
THIS FUNCTION WRITES TO:
|
||||
rss-thread-watch.watcher/chod-threads-cache"
|
||||
[url board starting-page]
|
||||
;; Todo: surround with try so we can timeout, 40x and other stuff
|
||||
(let [catalog (with-open [readr (io/reader url)]
|
||||
|
@ -89,7 +83,7 @@
|
|||
;; This will return nill and that fuck everything up
|
||||
(println "Refreshed cache for " board)
|
||||
(reset! (get @chod-threads-cache board)
|
||||
(build-cache to-index pages-total threads-per-page threads-total))))
|
||||
(build-cache to-index threads-per-page threads-total))))
|
||||
|
||||
(defn board-enabled?
|
||||
"Checks whether board is enabled in config"
|
||||
|
@ -109,13 +103,13 @@
|
|||
MAY CAUSE WRITE TO chod-thread-cache IF NECCESARRY"
|
||||
[board config]
|
||||
(let [refresh-rate (* 1000 (get-in config `(:boards-enabled ~board :refresh-rate)))
|
||||
{data :data
|
||||
time-downloaded :time
|
||||
board-catalog-url (get-in config `(:boards-enabled ~board :target))
|
||||
{time-downloaded :time
|
||||
:or {time-downloaded 0}
|
||||
:as board-atom } @(get @chod-threads-cache board)
|
||||
;; TODO: This also makes it implictly lazy-load -> if disabled make the check here
|
||||
time-to-update? (or (nil? board-atom)
|
||||
(> (System/currentTimeMillis) (+ refresh-rate time-downloaded)))]
|
||||
(if time-to-update?
|
||||
(update-board-cache! (get-board-url board config) board (get-in config [:boards-enabled board :starting-page]))
|
||||
(update-board-cache! board-catalog-url board (get-in config [:boards-enabled board :starting-page]))
|
||||
@(get @chod-threads-cache board))))
|
||||
|
|