Compare commits
10 commits
51345db93f
...
245f7a941c
Author | SHA1 | Date | |
---|---|---|---|
245f7a941c | |||
d88de2ec0f | |||
89d6053c9d | |||
ed9875c54e | |||
f392c8f897 | |||
f36f5be5d0 | |||
2173d6f7fa | |||
f1276154ac | |||
31ed46d6d9 | |||
45d7c671c7 |
3 changed files with 163 additions and 34 deletions
40
README.org
|
@@ -20,16 +20,33 @@ Right now there is no automated way to generate your feed url but making one by
|
||||||
|
|
||||||
**** URL parameters
|
**** URL parameters
|
||||||
|
|
||||||
| Param name | Default value | Can have multiple? | Mandatory? | Short description |
|
| Param name | Default value | Can have multiple? | Mandatory? | Short description |
|
||||||
|------------+---------------+--------------------+-----------------+--------------------------------------------------------------------------------------------------|
|
|------------+---------------+--------------------+----------------------+--------------------------------------------------------------------------------------------------|
|
||||||
| board | "mlp" | No | No | Which board to generate feed for, *ONLY* /mlp/ is supported |
|
| board | "mlp" | No | No (not implemented) | Which board to generate feed for, *ONLY* /mlp/ is supported |
|
||||||
| q | "" | Yes | Yes (1 or more) | This string is used to filter threads according to their titles |
|
| q | nil | Yes | Yes (1 or more) | This string is used to filter threads according to their titles |
|
||||||
| chod | 94 | No | No | CHanceOfDeath - will include thread in the feed if its chance of death is > chod |
|
| chod | 94 | No | No | CHanceOfDeath - will include thread in the feed if its chance of death is > chod |
|
||||||
| repeat | ~false~ | No | No | Whether to make a new notification on every server update even when the thread doesn't have a higher chod |
|
| repeat | ~false~ | No | No | Whether to make a new notification on every server update even when the thread doesn't have a higher chod |
|
||||||
| recreate | ~bool~ | Not implemented | Not implemented | Whether to notify when creation of a new thread matching the query is detected (uses 4chan's RSS) |
|
| recreate | ~bool~ | Not implemented | Not implemented | Whether to notify when creation of a new thread matching the query is detected (uses 4chan's RSS) |
|
||||||
|
|
||||||
**** How to create URL
|
**** How to create URL
|
||||||
|
|
||||||
|
- Standard URL rules apply; if you know how to pass params in a URL to any website, you don't even have to read this
|
||||||
|
- Open some text editor
|
||||||
|
- Paste in default URL: ~https://tools.treebrary.org/thread-watcher/feed.xml?~ (you can use plain HTTP if you want to)
|
||||||
|
- Now you can append any of the supported parameters (which you can find in the above table):
|
||||||
|
- For example, if we want to be informed about threads with "cute" in their title
|
||||||
|
- ~q=cute~ which would make ~https://tools.treebrary.org/thread-watcher/feed.xml?q=cute~
|
||||||
|
- If you want more than one param, separate with ~&~, for example:
|
||||||
|
- ~q=cute~ and ~q=pretty~ make ~https://tools.treebrary.org/thread-watcher/feed.xml?q=cute&q=pretty~
|
||||||
|
- Same is true for when you also want to specify ChoD
|
||||||
|
- ~https://tools.treebrary.org/thread-watcher/feed.xml?q=cute&q=pretty&chod=98~
|
||||||
|
- This will only notify you about threads that:
|
||||||
|
- Have ~cute~ or ~pretty~ in their title
|
||||||
|
- Are at or below the 98% mark of the catalog (around position ~147/150, i.e. about 3 threads away from being bumped off)
|
||||||
|
- Note that slashes (~//~) are not special characters: ~q=/general/~ will work as expected and match threads with "/general/" in their title
|
||||||
|
- Also note that regex is *NOT* supported for now, so something like ~q=rainbow*~ will only match threads with "rainbow[asterisk]"
|
||||||
|
in their title
|
||||||
|
|
||||||
*** Generating URL interactively
|
*** Generating URL interactively
|
||||||
|
|
||||||
Coming soon
|
Coming soon
|
||||||
|
@@ -42,8 +59,15 @@ This is an experimental project. There are several limitations:
|
||||||
|
|
||||||
** Feature set
|
** Feature set
|
||||||
|
|
||||||
- Planned/finished features
|
- Planned/finished features [7%]
|
||||||
|
- [X] [DONE] Super basic features done (feed, query, repeat)
|
||||||
|
- [ ] No params request should redirect to url generator or (for now) documentation
|
||||||
|
- [ ] Have proper sorting - The most likely to die threads first (is it enough to reverse the last filter input?)
|
||||||
- [ ] Config file instead of hardcoding config values
|
- [ ] Config file instead of hardcoding config values
|
||||||
|
- [ ] Include time of latest data fetch
|
||||||
|
- [ ] Make threads have preview images taken from the actual thread OP
|
||||||
|
- [ ] Show which query matched the thread you were notified of
|
||||||
|
- [ ] Option to include advanced HTML formatting of text (different color text for ChoD etc)
|
||||||
- [ ] Support notification on watched thread re-creation after it died
|
- [ ] Support notification on watched thread re-creation after it died
|
||||||
- [ ] Support notification for thread death
|
- [ ] Support notification for thread death
|
||||||
- [ ] Support multiple boards at once
|
- [ ] Support multiple boards at once
|
||||||
|
|
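The URL-building steps described in the README changes can be sketched in Clojure. This is only a minimal sketch and not part of these commits: the `feed-url` function and its `:chod` / `:repeat?` option keys are hypothetical names introduced for illustration, and only the base URL and the `q`, `chod` and `repeat` parameter names come from the README.

```clojure
(ns feed-url-sketch
  (:require [clojure.string :as s])
  (:import (java.net URLEncoder)))

;; Base URL as given in the README.
(def base-url "https://tools.treebrary.org/thread-watcher/feed.xml")

(defn feed-url
  "Hypothetical helper: builds a feed URL from a seq of query strings
  and an options map that may contain :chod and :repeat?."
  [queries {:keys [chod repeat?]}]
  (let [enc    (fn [v] (URLEncoder/encode (str v) "UTF-8"))
        ;; One q=... pair per query, then the optional parameters.
        params (concat (map (fn [q] ["q" q]) queries)
                       (when chod [["chod" chod]])
                       (when repeat? [["repeat" "true"]]))]
    (str base-url "?"
         (s/join "&" (map (fn [[k v]] (str k "=" (enc v))) params)))))

(println (feed-url ["cute" "pretty"] {:chod 98}))
```

Values are percent-encoded here, so a query such as `/general/` is sent in encoded form; this assumes the server decodes query parameters the usual way.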
|
@@ -1,29 +1,65 @@
|
||||||
|
;; Copyright (C) 2023 Felisp
|
||||||
|
;;
|
||||||
|
;; This program is free software: you can redistribute it and/or modify
|
||||||
|
;; it under the terms of the GNU Affero General Public License as published by
|
||||||
|
;; the Free Software Foundation, version 3 of the License.
|
||||||
|
;;
|
||||||
|
;; This program is distributed in the hope that it will be useful,
|
||||||
|
;; but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||||
|
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
||||||
|
;; GNU Affero General Public License for more details.
|
||||||
|
;;
|
||||||
|
;; You should have received a copy of the GNU Affero General Public License
|
||||||
|
;; along with this program. If not, see <https://www.gnu.org/licenses/>.
|
||||||
|
|
||||||
(ns rss-thread-watch.core
|
(ns rss-thread-watch.core
|
||||||
(:require [ring.adapter.jetty :as jetty])
|
(:require [ring.adapter.jetty :as jetty]
|
||||||
|
[ring.middleware.params :as rp]
|
||||||
|
[rss-thread-watch.watcher :as watcher]
|
||||||
|
[rss-thread-watch.feed-generator :as feed])
|
||||||
(:gen-class))
|
(:gen-class))
|
||||||
|
|
||||||
;; Internal default config
|
;; Internal default config
|
||||||
(def CONFIG
|
(def CONFIG
|
||||||
"Internal default config"
|
"Internal default config"
|
||||||
{:target "https://api.4chan.org/mlp/catalog.json" ;Where to download catalog from
|
{:target "https://api.4chan.org/mlp/catalog.json" ;Where to download catalog from
|
||||||
:start-index-at 50.0 ;We will search all threads that are lower in catalog than this % value
|
:starting-page 7 ;only monitor threads from this page and up
|
||||||
:refresh-delay (* 60 5) ;Redownload catalog every 5 mins
|
:refresh-delay (* 60 5) ;Redownload catalog every 5 mins
|
||||||
:port 6969 ;Liston on 6969
|
:port 6969 ;Listen on 6969
|
||||||
})
|
})
|
||||||
|
|
||||||
|
(defn set-interval
|
||||||
|
"Calls callback every ms milliseconds"
|
||||||
|
[callback ms]
|
||||||
|
(future (while true (do (try
|
||||||
|
(callback)
|
||||||
|
(println "Recached")
|
||||||
|
(catch Exception e
|
||||||
|
(binding [*out* *err*]
|
||||||
|
(println "Error while updating cache: " e ", retrying in 5 minutes"))))
|
||||||
|
(Thread/sleep ms)))))
|
||||||
|
|
||||||
(defn -main
|
(defn -main
|
||||||
"Entry point, starts webserver"
|
"Entry point, starts webserver"
|
||||||
[& args]
|
[& args]
|
||||||
())
|
(println "Starting on port: " (:port CONFIG)
|
||||||
|
"\nGonna recache every: " (:refresh-delay CONFIG) "s")
|
||||||
(defn handler [rqst]
|
(set-interval (fn []
|
||||||
{:status 404
|
(println "Starting cache update")
|
||||||
:header {"Content-Type" "text/html"}
|
(watcher/update-thread-cache! (:target CONFIG) (:starting-page CONFIG)))
|
||||||
:body "No poines here ^:("})
|
(* 1000 (:refresh-delay CONFIG)))
|
||||||
|
(jetty/run-jetty (rp/wrap-params feed/http-handler) {:port (:port CONFIG)
|
||||||
|
:join? true}))
|
||||||
|
|
||||||
;; Docs: https://github.com/ring-clojure/ring/wiki/Getting-Started
|
;; Docs: https://github.com/ring-clojure/ring/wiki/Getting-Started
|
||||||
(defn repl-main
|
(defn repl-main
|
||||||
|
"Development entry point"
|
||||||
[]
|
[]
|
||||||
(jetty/run-jetty handler {:port (:port CONFIG)
|
(jetty/run-jetty (rp/wrap-params #'feed/http-handler)
|
||||||
;; Don't block REPL thread
|
{:port (:port CONFIG)
|
||||||
:join? false}))
|
;; Don't block REPL thread
|
||||||
|
:join? false}))
|
||||||
|
;; (repl-main)
|
||||||
|
;; Single cache update for repl
|
||||||
|
;; (watcher/update-thread-cache! (:target CONFIG) (:starting-page CONFIG))
|
||||||
|
;; (watcher/update-thread-cache! "/home/michal/Zdrojaky/Clojure/rss-thread-watch/resources/catalog-pts9.json" (:starting-page CONFIG))
|
||||||
|
|
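Since `set-interval` returns the future it creates, the refresh loop can in principle be stopped from a REPL with `future-cancel`. The following is a sketch under assumptions rather than the committed code: it is a simplified copy of `set-interval`, and the `counter` atom and `job` var are hypothetical stand-ins for the real cache update.

```clojure
(defn set-interval
  "Calls callback every ms milliseconds on a background thread and
  returns the future running the loop."
  [callback ms]
  (future
    (while true
      (try
        (callback)
        (catch Exception e
          (binding [*out* *err*]
            (println "Callback failed:" e))))
      ;; The sleep sits outside the try, so an interrupt ends the loop.
      (Thread/sleep (long ms)))))

;; Hypothetical stand-in for watcher/update-thread-cache!.
(def counter (atom 0))

(def job (set-interval #(swap! counter inc) 10))
(Thread/sleep 200)    ; let the loop run a few times
(future-cancel job)   ; interrupts the sleeping thread
```

One caveat of this design: if the interrupt arrives while the callback is blocked in an interruptible call, the resulting exception is swallowed by the `catch` and the loop may keep running.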
|
@@ -13,11 +13,11 @@
|
||||||
;; along with this program. If not, see <https://www.gnu.org/licenses/>.
|
;; along with this program. If not, see <https://www.gnu.org/licenses/>.
|
||||||
|
|
||||||
(ns rss-thread-watch.feed-generator
|
(ns rss-thread-watch.feed-generator
|
||||||
"Generates feed for requests"
|
"Generates feeds for requests"
|
||||||
(:require [ring.middleware.params :as rp]
|
(:require [ring.middleware.params :as rp]
|
||||||
[clj-rss.core :as rss]
|
[clj-rss.core :as rss]
|
||||||
[clojure.string :as s]
|
[clojure.string :as s]
|
||||||
[rss-thread-watch.watcher :as cache])
|
[rss-thread-watch.watcher :as watcher])
|
||||||
(:gen-class))
|
(:gen-class))
|
||||||
|
|
||||||
|
|
||||||
|
@@ -34,7 +34,7 @@
|
||||||
|
|
||||||
This is done by always making new GUID - (concat thread-number UNIX-time-of-data-update)"
|
This is done by always making new GUID - (concat thread-number UNIX-time-of-data-update)"
|
||||||
[thread time]
|
[thread time]
|
||||||
(assoc thread guid (str (:no thread)
|
(assoc thread :guid (str (:no thread)
|
||||||
"-"
|
"-"
|
||||||
time)))
|
time)))
|
||||||
|
|
||||||
|
@@ -45,20 +45,23 @@
|
||||||
[thread]
|
[thread]
|
||||||
(assoc thread :guid (format "%d-%.2f"
|
(assoc thread :guid (format "%d-%.2f"
|
||||||
(:no thread)
|
(:no thread)
|
||||||
:chod thread)))
|
(:chod thread))))
|
||||||
|
|
||||||
(defn filter-chod-posts
|
(defn filter-chod-posts
|
||||||
"Return list of all threads with equal or higher ChoD than requested"
|
"Return list of all threads with equal or higher ChoD than requested"
|
||||||
[query-vec chod-treshold repeat?]
|
[query-vec chod-treshold repeat? cache]
|
||||||
(let [time-of-generation (System/currentTimeMillis)
|
(let [time-of-generation (System/currentTimeMillis)
|
||||||
guid-fn (if repeat? (fn [x] (new-guid-always x time-of-generation))
|
guid-fn (if repeat? (fn [x] (new-guid-always x time-of-generation))
|
||||||
update-only-guid)
|
update-only-guid)
|
||||||
cache-start-index (first (indices (fn [x] (>= x chod-treshold))
|
cache-start-index (first (indices (fn [x] (>= (:chod x) chod-treshold))
|
||||||
@cache/chod-threads-cache))
|
cache))
|
||||||
;; So we don't have to search thru everything we have cached
|
;; So we don't have to search thru everything we have cached
|
||||||
needed-cache-part (subvec @cache/chod-threads-cache cache-start-index)
|
needed-cache-part (subvec cache cache-start-index) ;Todo: remove that ugly global reference
|
||||||
actuall-matches (keep (fn [t]
|
actuall-matches (keep (fn [t]
|
||||||
(let [title (:title t)]
|
(let [title (:title t)]
|
||||||
|
;; Todo: Man, wouldn't it be cool to know which query matched the thread?
|
||||||
|
;; Would be so much easier for user to figure out why it is showing
|
||||||
|
;; and it would solve the problem of super long titles (or OPs instead of titles)
|
||||||
(when (some (fn [querry]
|
(when (some (fn [querry]
|
||||||
(s/includes? title querry))
|
(s/includes? title querry))
|
||||||
query-vec)
|
query-vec)
|
||||||
|
@@ -67,6 +70,28 @@
|
||||||
;; Finally generate and append GUIDs
|
;; Finally generate and append GUIDs
|
||||||
(map guid-fn actuall-matches)))
|
(map guid-fn actuall-matches)))
|
||||||
|
|
||||||
|
(defn thread-to-rss-item
|
||||||
|
"Converts a cached thread into an RSS item map. The cached version could have been made to look like
|
||||||
|
an RSS item already; this will be refactored, it just needed to be done ASAP" ;Todo: do what the docstring says
|
||||||
|
[t]
|
||||||
|
(let [link-url (str "https://boards.4chan.org/mlp/thread/" (:no t))] ; jesus, well I said only /mlp/ is supported now so fuck it
|
||||||
|
{:title (format "%.2f%% - %s" (:chod t) (:title t))
|
||||||
|
;; :url link-url <- this is supposed to be for images according to: https://cyber.harvard.edu/rss/rss.html
|
||||||
|
:description (format "The thread: '%s' has %.2f%% chance of dying" (:title t) (:chod t))
|
||||||
|
:link link-url
|
||||||
|
:guid (:guid t)}))
|
||||||
|
|
||||||
|
(defn generate-feed
|
||||||
|
"Generates feed from matching items"
|
||||||
|
[query-vec chod-treshold repeat? cache]
|
||||||
|
(let [items (filter-chod-posts query-vec chod-treshold repeat? cache)
|
||||||
|
head {:title "RSS Thread watcher v0.1"
|
||||||
|
:link "https://tools.treebrary.org/thread-watcher/feed.xml"
|
||||||
|
:feed-url "https://tools.treebrary.org/thread-watcher/feed.xml"
|
||||||
|
:description "RSS based thread watcher"}
|
||||||
|
body (map thread-to-rss-item items)]
|
||||||
|
(rss/channel-xml head body)))
|
||||||
|
|
||||||
(defn http-handler
|
(defn http-handler
|
||||||
"Handles HTTP requests, returns generated feed
|
"Handles HTTP requests, returns generated feed
|
||||||
|
|
||||||
|
@@ -74,7 +99,51 @@
|
||||||
rss-thread-watch.watcher.chod-threads-cache
|
rss-thread-watch.watcher.chod-threads-cache
|
||||||
rss-thread-watch.core.CONFIG"
|
rss-thread-watch.core.CONFIG"
|
||||||
[rqst]
|
[rqst]
|
||||||
{:status 200
|
(try (let [{{chod "chod" :or {chod 60}
|
||||||
:header {"Content-Type" "text/html"}
|
:as prms} :params
|
||||||
:body "All pony here ^:)"})
|
uri :uri} rqst
|
||||||
|
queries (if (vector? (prms "q")) (prms "q") [(prms "q")]) ; to always return vector
|
||||||
|
repeat? (prms "repeat")
|
||||||
|
real-chod (try ;If we can't parse number from given chod param, just use 94
|
||||||
|
(if (or (vector? chod)
|
||||||
|
(< (Integer/parseInt chod) 60)) ; Never accept chod lower than 60 TODO: don't hardcode this
|
||||||
|
94 (Integer/parseInt chod))
|
||||||
|
(catch Exception e
|
||||||
|
94))
|
||||||
|
cache @watcher/chod-threads-cache]
|
||||||
|
;; (println "RCVD: " rqst)
|
||||||
|
(println rqst)
|
||||||
|
;; ====== Errors =====
|
||||||
|
;; Something other than feed.xml requested
|
||||||
|
(when-not (s/ends-with? uri "feed.xml")
|
||||||
|
(throw (ex-info "404" {:status 404
|
||||||
|
:headers {"Content-Type" "text/plain"}
|
||||||
|
:body "404 This server has nothing but /feed.xml"})))
|
||||||
|
;; No query specified - don't know what to search for
|
||||||
|
(when-not (prms "q")
|
||||||
|
(throw (ex-info "400" {:status 400
|
||||||
|
:headers {"Content-Type" "text/plain"}
|
||||||
|
:body (str "400 You MUST specify query with one OR more 'q=searchTerm' URL parameter(s)\n\n\n"
|
||||||
|
"Example: '/feed.xml?q=pony&q=IWTCIRD' will show in your feed all threads with 'pony' or 'IWTCIRD'"
|
||||||
|
" in their title that are about to die.")})))
|
||||||
|
;; Whether cache has been generated yet
|
||||||
|
(when (empty? cache)
|
||||||
|
(throw (ex-info "503" {:status 503
|
||||||
|
:headers {"Content-Type" "text/plain"}
|
||||||
|
:body (str "503 Service Unavailable\n"
|
||||||
|
"Cache is empty, cannot generate feed. Try again later, it may work.")})))
|
||||||
|
;; ==== Everything good ====
|
||||||
|
{:status 200
|
||||||
|
;; There shouldn't be any problems with this mime type but if there are
|
||||||
|
;; replace with "text/xml", or even better, get RSS reader that is not utter shit
|
||||||
|
:headers {"Content-Type" "application/rss+xml"}
|
||||||
|
:body (generate-feed queries real-chod repeat? cache)})
|
||||||
|
(catch Exception e
|
||||||
|
;; Ex-info has been crafted to match HTTP response body so we can send it
|
||||||
|
(if-let [caught (ex-data e)] ;I think ex-data will always be there though; this should check whether it contains some key (body? so there doesn't have to be an extra one)
|
||||||
|
caught ;We have custom crafted error
|
||||||
|
{:status 500 ;Something else fucked up, we print what happened
|
||||||
|
:headers {"Content-Type" "text/plain"}
|
||||||
|
:body (str "500 - Something fucked up while generating feed. If you decide to report it, please include the URL address you used:\n"
|
||||||
|
(ex-cause e) "\n"
|
||||||
|
e)}))))
|
||||||
|
|
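The parameter handling inside `http-handler` can be illustrated in isolation. This is a minimal sketch, not the committed code: `normalize-queries` and `parse-chod` are hypothetical helper names. It assumes, as Ring's `wrap-params` does, that a parameter given once arrives as a string and a parameter given several times arrives as a vector. The fallback of 94 and the minimum of 60 mirror the values in the handler. Unlike the handler, the sketch returns an empty vector when `q` is missing.

```clojure
(defn normalize-queries
  "Hypothetical helper: always returns a vector of query strings."
  [q]
  (cond
    (nil? q)    []      ; no q parameter at all
    (vector? q) q       ; q given several times
    :else       [q]))   ; q given once

(defn parse-chod
  "Hypothetical helper: parses the chod parameter, falling back to 94
  when it is missing, repeated, not a number, or lower than 60."
  [raw]
  (try
    (let [n (Integer/parseInt raw)]
      (if (< n 60) 94 n))
    (catch Exception _
      ;; nil, a vector, or non-numeric text all end up here
      94)))

(println (normalize-queries "cute") (parse-chod "98"))
```

Keeping these as small pure functions would also make the defaults easy to move into a config map later, which the handler's TODO about not hardcoding 60 hints at.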