I am really intensely curious as to how well this went. Did you try it, have your router melt down, and lose your life to a lynch mob of unhappy users? Or...?
Thank you for the detailed explanation. You are right! I will implement it and check what impact it has on the CPU. I read your post and my my, what a gem of information it is. I'll test it out and share the results as well. I have 15.2 GB of memory free on the CCR, so I don't think I'll need to change memlimit. I'll share my findings in a few days.
I'll have to try this on an ARM router (CCR2116) to see if it can handle the Cake work better than the 1036.
Dave, I see dramatically less packet loss with cake. Cake may be dropping packets as part of traffic control, but by managing the congestion, other flows drop a lot fewer.
Packet loss is not a particularly good metric to use against a cake or fq_codel instance, as it uses packet loss to control congestion if RFC3168 is not enabled by the endpoints. In exchange for decreased latency, you get more packet loss. So in both fq_codel and cake you should have seen an increase in packet loss and an improvement in network latency. Better questions to ask are: did the network "feel" better? Did videoconferencing and VoIP work better?
Now, if cake is actually mis-behaving due to cpu load or otherwise, and randomly inducing packet loss rather than intelligently dropping flows, then you are right to disable it.
What I try to do in a circumstance like this is to get a packet capture of a few test flows, and look at them via wireshark. It could very well be that your queues and offloads and/or applications are behaving better in this case with the fast path enabled, or there is a bug related to the offload + cake, or that the customer finds the range of latencies they currently experience acceptable.
The RB4011 isn't dramatically faster. It's a 32-bit ARMv7.
I agree about the limited single-core performance of the Tile architecture, but in the test done by sirbryan, he replicated the situation on an RB4011, which has much better single-core performance (it has an OoO A15 CPU), at a rate that is normal for that router when shaping: 200-300 Mbps.
Feel free to post a sanitized config.
I've reconfigured the queue setup on the 4011 and will report back. After some more digging, I think I "overconfigured" it initially, burdening the CPU with unnecessary tasks.
/queue type
add fq-codel-flows=10240 fq-codel-limit=1024 fq-codel-memlimit=320.0MiB fq-codel-quantum=300 kind=fq-codel name=fq-codel
add cake-diffserv=besteffort cake-mpu=84 cake-overhead=38 cake-overhead-scheme=ethernet cake-rtt-scheme=internet kind=cake name=cake-interface
/queue tree
add limit-at=180M max-limit=180M name="LTU 192 " packet-mark=no-mark parent=vlan4001 queue=cake-interface
add limit-at=180M max-limit=180M name="LTU 203" packet-mark=no-mark parent=vlan4002 queue=cake-interface
add limit-at=180M max-limit=180M name="LTU 110" packet-mark=no-mark parent=vlan4004 queue=cake-interface
add limit-at=200M max-limit=200M name="LTU 227" packet-mark=no-mark parent=vlan4003 queue=cake-interface
add limit-at=150M max-limit=150M name=Airmax packet-mark=no-mark parent=vlan4010 queue=cake-interface
Do you use FastTrack with this setup?
Here's an example of how I build these. This is a CCR2004, so I can get away with a bit more than on a 4011, but the principles are the same. This is a very effective model. I don't use VLANs per se, so this is IP matched:
Match by IP and mark UL and DL packets separately via mangle.
Add a queue tree with NO markers to identify the network structure.
Add an individual queue with the packet mark and the shaped speed.
Currently doing fq_codel for 'groups' and cake for individuals.
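The steps above could be sketched roughly like this. It is only an illustration, not the poster's actual config: the customer address (10.0.0.10), the plan speeds, the AP capacities, and every queue and mark name are made-up placeholders. Note also that the matching traffic must not be fasttracked, otherwise it never reaches the queues.

```
# Hypothetical customer 10.0.0.10 on a 100M/20M plan behind one AP.
# All names, addresses and rates here are placeholders.

# 1. Mark upload and download packets separately, matched by IP.
/ip firewall mangle
add chain=forward action=mark-packet src-address=10.0.0.10 new-packet-mark=cust1-ul passthrough=no comment="cust1 upload"
add chain=forward action=mark-packet dst-address=10.0.0.10 new-packet-mark=cust1-dl passthrough=no comment="cust1 download"

# Queue types: fq_codel for the group level, cake for individual customers.
/queue type
add kind=fq-codel name=fq-codel-group
add kind=cake name=cake-cust cake-diffserv=besteffort

/queue tree
# 2. Structural parents with no packet mark; they only describe the network.
add name=ap1-dl parent=global max-limit=180M queue=fq-codel-group
add name=ap1-ul parent=global max-limit=60M queue=fq-codel-group
# 3. One leaf queue per customer and direction, with packet mark and shaped speed.
add name=cust1-dl parent=ap1-dl packet-mark=cust1-dl max-limit=100M queue=cake-cust
add name=cust1-ul parent=ap1-ul packet-mark=cust1-ul max-limit=20M queue=cake-cust
```

Keeping one mark per direction is what gives each leaf queue an explicit direction, at the cost of two mangle rules per customer.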
(Attachment: Screenshot 2023-03-01 at 5.04.18 PM.png)
No, you cannot. FastTrack bypasses queues.
Each of those VLANs goes to an AP. I have 10-25 customers per AP. The queueing is on the VLAN's egress.
I don't like how queue trees behave when explicit directionality isn't set via a packet mark.
Am I extrapolating that you have a vlan per customer from this tree?
Then you need faster hardware. The RB4011 can handle about a gig of cake queues as long as it's not doing NAT; if you NAT on the same box, that's more like 600 Mbps.
That's why I'm asking.
As soon as I add the rules below before fasttrack, to use them for simple queues, the general speed for that network drops from 800-900 Mbps to 400 Mbps at most:
3 ;;; cust: guests download
  chain=forward action=accept connection-state=established,related dst-address=192.168.100.0/24 log=no log-prefix=""
4 ;;; cust: guests upload
  chain=forward action=accept connection-state=established,related src-address=192.168.100.0/24 log=no log-prefix=""
I'm just thinking about how to make it work in the best way.
I have a NAS device for which I would like to limit upload during the daytime, but give full speed overnight.
With rules like the above, the simple queue works fine, but I would have to disable them at night to give full speed. Plus, during the daytime it cuts download speed because the traffic skips fasttrack.
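One possible way to approach the NAS case, sketched under assumptions: exempt only the NAS from fasttrack instead of the whole /24, and let the simple queue's `time` property switch the limit off at night. The NAS address (192.168.100.50), the 20M upload cap, and the 8h-22h window are hypothetical values, and this has not been tested on the hardware discussed here.

```
# Hypothetical NAS address and limits; adjust to the real host and plan.

# Exempt only the NAS from fasttrack, in both directions, so that the rest
# of the network can stay fasttracked. Both rules need to sit above the
# fasttrack-connection rule in the forward chain.
/ip firewall filter
add chain=forward action=accept connection-state=established,related src-address=192.168.100.50 comment="NAS upload: skip fasttrack"
add chain=forward action=accept connection-state=established,related dst-address=192.168.100.50 comment="NAS download: skip fasttrack"

# Limit NAS upload to 20M during the day only (max-limit is upload/download,
# 0 = unlimited). Outside the time window the queue is inactive.
/queue simple
add name=nas-daytime target=192.168.100.50/32 max-limit=20M/0 time=8h-22h,sun,mon,tue,wed,thu,fri,sat
```

With this layout the NAS traffic still bypasses fasttrack at night, but only that one host pays the CPU cost rather than the whole network.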
You just lose flexibility, because you can only really shape 'downloads' to the individual AP. What if your uploads get overwhelmed? You don't have an interface shaper that will handle that if you have more than 1 AP.
Each of those VLANs goes to an AP. I have 10-25 customers per AP. The queueing is on the VLAN's egress.
My hope was to take just enough of the edge off before traffic hits the AP as opposed to letting the AP's max out.
So far, so good.
So far, there hasn't been a single problem. At some point in the future I'll load queueing + shapers onto customers' IPs, or better yet onto their routers on the outbound interface.
Ideally you shape downloads on the head end with a big shaper tree, so you don't transport packets you will end up dropping. And you shape uploads twice: once at the customer for plan enforcement, and once at or right above the AP to keep it from getting overwhelmed. Maybe also at each backhaul if you have constraints.
I would just implement it and go measure, and be ready to roll back. Pound it flat with artificial traffic at 4am?
(I am one of the authors of fq_codel and cake, but I do not have enough data either on how well this stuff scales on given bits of MikroTik hardware.) In most cases it is the per-customer shaper that dominates the CPU by a factor of 9, and the underlying queue type, be it a fifo, sfq, fq_codel or cake, adds only a tiny bit (cake is about 2.5 times as slow as fq_codel but does more, and again, it is the shaping cost that dominates). If you are low on memory, for folk running at less than a gbit, you can use a memlimit of 8 MB rather than the default 32 MB.
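As a small illustration of the memlimit suggestion, an fq_codel queue type with the lower limit might look like this. The queue type name is a placeholder; the `fq-codel-memlimit` parameter is the same one that appears in the config posted earlier in the thread.

```
# Hypothetical queue type name; 8.0MiB follows the suggestion for links under 1 Gbit.
/queue type
add kind=fq-codel name=fq-codel-lowmem fq-codel-memlimit=8.0MiB
```

Queues that should use the smaller limit would then reference `queue=fq-codel-lowmem`.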
On x86 gear we have individual implementations of cake scaling to 10 Gbit/core. On a 500 MHz single-core MIPS, cake scales to only about 80 Mbit. Your mileage will be somewhere between those. :/
See also.
viewtopic.php?t=179307
Certainly is, that's why you need to bench test. Further, I find that ipq4018 devices aren't stable on cake when you overrun the queue. It's easy to replicate: do a bandwidth test at 2x the shaper and the device will crash. That's probably because not many people are doing this across all hardware types.
Hiya.. that certainly looks like an easy comment, but when you are running production MikroTik hardware with over 1000 PPPoE sessions.. it does not seem the smartest thing to do.. beta-testing on production hardware with lots of PPPoE sessions running..
Hi. So at last I got the time to test it out. I implemented FQ_Codel using a simple queue for a customer of mine on the CCR with a cap of 800 Mbps and, to my disappointment, the CCR's CPU hit 100%. In addition, the customer's throughput dropped to half of what I had set in the simple queue. I've not tested it on customers with circuits below 100 Mbps yet, but it seems the CCR can't handle it on high-capacity circuits.
I assume that this is on your 1036 right? Those TILE CPUs just can't cut it. CCR2116 is THE mikrotik shaping box to get...