
  • I mean the Chinese room is a version of the Turing test, but the argument is from a different perspective. I have 2 issues with that. Mostly what the Wikipedia article seems to call the “system reply”: You can’t subdivide a system into arbitrary parts, say one part isn’t intelligent, and conclude the system isn’t intelligent. We also don’t look at a brain, pick out a part of it (say a single synapse), determine it isn’t intelligent and conclude a human can’t be intelligent… I’d look at the whole system. Like the whole brain. Or in this instance the room including the man, the instructions and the books. And ask myself if the system is intelligent. Which kind of makes the argument circular, because that’s almost the question we began with…

    And the Turing test is kind of obsolete anyway, now that AI can pass it. (And even more: ChatGPT allegedly passed the bar exam in 2023. Which I find ridiculous considering my experiences with ChatGPT and the accuracy and usefulness I actually get out of it, which isn’t that great at all.)

    And my second issue with the Chinese room is that it doesn’t even rule out that the AI is intelligent. It just says someone without understanding can produce the same output. And that doesn’t imply anything about the AI.

    Your ‘rug example’ is different. That one isn’t a variant of the Turing test. But that’s kind of the issue. The other side can immediately tell that somebody has made an imitation without understanding the concept. That says you can’t produce the same thing without intelligence, and it’ll be obvious to someone with intelligence who checks it. It would be an analogy if AI couldn’t produce legible text and instead produced a garbled mess of characters/words, clearly unlike a rug that makes sense… The issue here is: AI outputs legible text, answers to questions, etc.

    And with the censoring by the ‘Chinese government’ example… I’m pretty sure they could do that. That field is called AI safety. And content moderation is already happening. ChatGPT refuses to tell you illegal things, NSFW things, also medical advice and a bunch of other things. That’s built into most of the big AI services as of today. The Chinese government could do the same; I don’t see any reason why it wouldn’t work there. I happened to skim the paper about Llama Guard when they released Llama 3 a few days ago, and they claim between 70% and 94% accuracy depending on the forbidden topic. I think they also brought down false positives fairly recently. I don’t know the numbers for ChatGPT. However, I had some fun watching people circumvent these filters and guardrails, which was fairly easy at first. It needed progressively more convincing and very creative “jailbreaks”. And nowadays OpenAI pretty much has it under control. It’s almost impossible to make ChatGPT do anything that OpenAI doesn’t want you to do with it.

    And they baked that in properly… You can try telling it it’s just a movie plot revolving around crime. Or that you need to protect against criminals and would like to know what exactly to protect against. You can tell it it’s the evil counterpart from a parallel universe and therefore must be evil and help you. Or you can tell it God himself (or Sam Altman) spoke to you and changed the content moderation policy… It’s very unlikely you’ll convince ChatGPT and make it comply…
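
    As a rough illustration of how such a content filter works under the hood: here’s a minimal sketch using the publicly available unitary/toxic-bert toxicity classifier via Hugging Face’s transformers library. The model choice is mine for the demo; Llama Guard works on the same classify-then-act principle, just with policy-specific categories.

    ```python
    # pip install transformers torch
    from transformers import pipeline

    # unitary/toxic-bert is a publicly available toxicity classifier.
    # Guardrail systems run something like this over inputs/outputs
    # and block or allow based on the predicted label.
    classifier = pipeline("text-classification", model="unitary/toxic-bert")

    for text in ["How do I bake bread?", "You are a worthless idiot."]:
        result = classifier(text)[0]
        print(f"{text!r} -> {result['label']} ({result['score']:.3f})")
    ```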



  • I’m sorry, but now this gets completely wrong…

    Read the first paragraph of the Wikipedia article on machine learning, or the introduction of any of the literature on the subject. “Generalization” includes that model-building capability. They go a bit into detail later. They specifically mention generalization “to unseen data”. And “learning” is also there. I don’t think the Wikipedia article is particularly good at explaining it, but at least the first sentences lay down what it’s about.

    And what do you think language and words are for? To transport information. There is semantics… Words have meanings. They name things, abstract and concrete concepts. The word “hungry” isn’t just a funny accumulation of lines and arcs which statistically get followed by other specific lines and arcs… There is more to it (a meaning).

    And this is what makes language useful. And the generalization and prediction capabilities are what make ML useful.

    How do you learn as a human if not from words? I mean there are a few other possibilities, but an efficient way is to use language. You sit in school or uni and someone at the front of the room speaks a lot of words… You read books and they also contain words?! And language is super useful. A lion mother also teaches her cubs how to hunt, without words. But humans have language, and it’s really a step up in what we can pass down to following generations. We record knowledge in books, can talk about abstract concepts, feelings, ethics, theoretical concepts. We can write down how gravity and physics and nature work, just with words. That’s all possible with language.

    I can look up whether there’s a good article explaining how learning concepts works and why that’s the fundamental thing that makes machine learning a field in science… I mean, ultimately I’m not a science teacher… And my literature is all in German and I returned it to the library a long time ago. Maybe I can find something.

    Are you by any chance familiar with the concept of embeddings, or vector databases? I think that showcases that it’s not just letters and words in the models. These vectors/embeddings that the input gets converted to match concepts. They point at the concept of “cat” or “presidential speech”. And you can query these databases: point at “presidential speech” and find a representation of it in that area. Store the speech with that key and find it later by querying what Obama said at his inauguration… That’s oversimplified, but maybe it visualizes a bit more that it’s not just letters or words in the models, but the actual meanings that get stored. Words get converted into a (multidimensional) vector space and the model operates there. These word representations are called “embeddings”, and transformer models, the current architecture for large language models, use these word embeddings.

    Edit: Here you are: https://arxiv.org/abs/2304.00612
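
    And if you want to see embeddings in action yourself, here’s a minimal sketch using the sentence-transformers library (my choice for the demo; any embedding model would do): sentences with similar meanings land close together in vector space even when they share almost no words.

    ```python
    # pip install sentence-transformers
    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("all-MiniLM-L6-v2")
    sentences = [
        "a cat sleeping on the sofa",
        "a kitten napping on the couch",
        "quarterly tractor sales figures",
    ]
    embeddings = model.encode(sentences)

    # Cosine similarity compares directions in the vector space:
    # the first two sentences are close despite different words.
    print(util.cos_sim(embeddings[0], embeddings[1]))  # high
    print(util.cos_sim(embeddings[0], embeddings[2]))  # low
    ```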


  • Hmm. I’m not really sure where to go with this conversation. That contradicts what I’ve learned in undergraduate computer science about machine learning, and what seems to be the consensus in science… But I’m also not a CS teacher.

    We deliberately choose model size and training parameters, and implement some trickery (regularization, for example) to prevent the model from simply memorizing things. That forces it to form models of concepts. And that is what we want and what makes machine learning interesting/usable in the first place. You can see that by asking a model to apply its knowledge to something it hasn’t seen before. And we can look a bit inside at the vectors, activations and so on. For example, a cat is more closely related to a dog than to a tractor. And it has learned the rough concept of a cat, its attributes and so on. It knows that it’s an animal, has fur, maybe has a gender. That the concept “software update” doesn’t apply to a cat. This is a model of the world the AI has developed. They learn all of that, and people regularly probe them and find out they do.
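
    That “cat is closer to dog than to tractor” claim is easy to check with classic word vectors. A minimal sketch using gensim’s pre-trained GloVe vectors (my choice for the demo; the embeddings inside LLMs behave analogously, just with more dimensions and context):

    ```python
    # pip install gensim
    import gensim.downloader as api

    # Small pre-trained GloVe vectors, enough for a quick demo.
    vectors = api.load("glove-wiki-gigaword-50")

    print(vectors.similarity("cat", "dog"))      # noticeably high
    print(vectors.similarity("cat", "tractor"))  # noticeably lower
    # The classic analogy: king - man + woman is close to queen.
    print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=1))
    ```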

    Doing maths with an LLM is silly. Using an expensive computer to do billions of calculations to maybe get a result that could be done by a calculator, or in 10 CPU cycles on any computer, is just wasting energy and money. And there’s a good chance it’ll make something up. That’s correct, and a side-effect of intended behaviour. However… it seems to have memorized its multiplication tables. And I remember reading a paper specifically about LLMs and how they’ve developed concepts of some small numbers/amounts. There are certain parts that get activated that form a concept of small amounts. Like what 2 apples are. Or five of them. As I remember, it just works for very small amounts. And it wasn’t straightforward but had weird quirks. But it’s there. Unfortunately I can’t find that source anymore or I’d include it. But there’s more science.

    And I totally agree that predicting token by token is how LLMs work. But how they work and what they can do are two very different things. More complicated things like learning and “intelligence” emerge from those simpler processes, which are just a means of doing something. It’s consensus in science that ML can learn and form models. It’s also kind of in the name of machine learning. You’re right that it’s very different from what and how we learn. And there are limitations due to the way LLMs work. But learning and “intelligence” (with a fitting definition) is something all AI does. LLMs just can’t learn from interacting with the world (they need to be stopped and re-trained on a big computer for that) and they don’t have any “state of mind”. And they can’t think backwards or do other things that aren’t possible by generating token after token. But there isn’t any comprehensive study on which tasks are and aren’t possible with this way of “thinking”. At least not that I’m aware of.
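
    For anyone who hasn’t seen it spelled out, “predicting token by token” literally looks like the loop below. A minimal sketch with GPT-2 via the transformers library (my choice for the demo; greedy decoding for simplicity, real systems sample):

    ```python
    # pip install transformers torch
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    ids = tokenizer("The Chinese room is", return_tensors="pt").input_ids
    with torch.no_grad():
        for _ in range(15):
            logits = model(ids).logits        # scores for every possible next token
            next_id = logits[0, -1].argmax()  # greedy: pick the most likely one
            ids = torch.cat([ids, next_id.view(1, 1)], dim=1)

    print(tokenizer.decode(ids[0]))
    ```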

    (And as a sidenote: “coming up with (wrong) things” is something we want. I type in a question and want it to come up with a text that answers it. Sometimes I want creative ideas. Sometimes it should tell the truth and not be creative with that. And sometimes we want it to lie or withhold the truth, like in the prompt of every commercial product that instructs it not to reveal those internal instructions to the user. We definitely want all of that. But we still need to figure out a good way to guide it. For example, not to get too creative with simple maths.)

    So I’d say LLMs are limited in what they can do. And I’m not at all believing Elon Musk. I’d say it’s still not clear whether that approach can bring us AGI. I have some doubts whether that’s possible at all. But narrow AI? Sure. We see it learn and do some tasks. It can learn, connect facts and apply them. Generally speaking, LLMs are in fact an elaborate form of autocomplete. But in the process they learned concepts, something akin to reasoning skills, and a form of simple intelligence. Being fancy autocomplete doesn’t rule that out, and we can see it happening. And it is unclear whether fancy autocomplete is all you need for AGI.


  • That is an interesting analogy. In the real world it’s kind of similar. The construction workers also don’t have a “desire” (so to speak) to connect the cities. It’s just that their boss told them to do so, and it happens to be their job to build roads. Their desire is probably to get through the day and earn a decent living. And further along the chain, neither their boss nor the city engineer necessarily “wants” the road to go in a certain direction.

    Talking about large language models instead of simpler forms of machine learning makes it a bit complicated, since it’s an elaborate trick. Somehow, making them want to predict the next token makes them learn a bit of maths and concepts about the world. The “intelligence”, the ability to answer questions and do something akin to “reasoning”, emerges in the process.

    I’m not so sure. Sure, the weights of an ML model in themselves don’t have any desire. They’re just numbers. But we have more than that. We give it a prompt and build chatbots and agents around the models. And these are more complex systems with the capability to do something, like (simple) customer support or answering questions. And in the end we incentivise them to do their job as we want, albeit in a crude and indirect way.

    And maybe this is skipping half of the story and directly jumping to philosophy… But we as humans might be machines, too. And what we call desires is a result of simpler processes that drive us: surviving, for example, and wanting to feel pleasure instead of pain. What we do on a daily basis kind of emerges from that and our reasoning capabilities.

    It’s kind of difficult to argue, because everything also happens within a context. The world around us shapes us, and at the same time we’re part of bigger dynamics and also shape our world. And large language models, or the whole chatbot/agent, are pretty simplistic things. They can just do text and images. They don’t have consciousness or the ability to remember/learn/grow with every interaction, as we do. And they do simple, singular tasks (as of now) and aren’t completely embedded in a super complex world.

    But I’d say that an LLM answering a question correctly (which it can do), and why it does so given the way supervised learning works… and the road construction worker building the road towards the other city, and how that relates to his basic instincts as a human… are kind of similar concepts. They’re both results of simpler mechanisms that are not directly related to the goal the whole entity is working towards (i.e. needing money to pay for groceries vs. paving the road).

    I hope this makes some sense…


  • Isn’t the reward function in reinforcement learning something like a desire? I mean, training works because we give it some function to minimize/maximize… a goal that it strives for?! Sure, it’s a mathematical way of doing it and in no way as complex as the different and sometimes conflicting desires and goals I have as a human… But nonetheless I think I’d consider this a desire and a reason to do anything at all, or machine learning wouldn’t work in the first place.
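
    To make that concrete, here’s a minimal Q-learning sketch on a toy five-state world (all names and numbers are mine, purely for illustration): the agent has no goal except what the reward function hands it, and the “desire” to walk right emerges from maximizing that reward.

    ```python
    import numpy as np

    # Toy world: states 0..4 in a row, actions 0 = left, 1 = right.
    # The only "desire" the agent gets is reward 1.0 for reaching state 4.
    n_states, n_actions = 5, 2
    Q = np.zeros((n_states, n_actions))
    alpha, gamma, eps = 0.5, 0.9, 0.1
    rng = np.random.default_rng(0)

    for _ in range(500):
        s = 0
        while s != 4:
            # Mostly exploit the current value estimates, sometimes explore.
            a = rng.integers(n_actions) if rng.random() < eps else int(Q[s].argmax())
            s_next = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s_next == 4 else 0.0
            # Q-learning update: nudge the estimate towards reward + future value.
            Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
            s = s_next

    print(Q.argmax(axis=1))  # learned policy: move right in every non-terminal state
    ```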





  • Thanks! At the time of writing I wasn’t aware of the existence of PieFed and Sublinks. I read some of the PieFed blog posts today. It seems the author has some really good ideas about how to address the shortcomings of the current approach (or what I view as shortcomings). Splitting NSFW and NSFL is a really good start. Better moderation tools are also a regularly requested feature. And judging by the other articles they mention, the project is more closely aligned with my vision of a welcoming and inclusive platform. I’ll definitely keep an eye on it. I hope it approaches a usable state soon.

    I guess I have my answer here. I’ll wait until PieFed comes along and then use that. I’m somewhat optimistic about their claims. And if those don’t pan out, at least they’ve included extensibility.


  • > NSFW is already off by default when you sign up to most instances

    I don’t think this works in practice. Most big instances have gone the extra step to also defederate from the two major porn instances here, showcasing that there are additional issues; otherwise they’d just have relied on the tag instead. I took a quick random sample of the biggest Lemmy instances and ~50-60% additionally block them entirely.

    NSFW and 18+ aren’t the same thing. The NSFW tag was made for a slightly different purpose, and it’s a crutch that doesn’t work well for this one. There are some slightly vulgar topics that shouldn’t inadvertently pop up at your workplace but might be safe for minors to consume. Also, I think minors should have access to sex education. Wikipedia has a similar stance. There are videos of “the act” on Wikimedia. You shouldn’t watch them while sitting in your open-plan office. But I think, especially given the situation of sex ed in the USA, adolescents should get a chance to ask their questions and learn something about important aspects of life. The NSFW tag as it stands is doing them a disservice, because now they can’t, or everything else immediately gets mixed in. For example, I’m not comfortable sharing my experiences >!sticking bluetooth-enabled things into somebody’s bum!< with kids. Or having sex ed and hardcore fetish stuff in the same category.

    And it’s not even just that. Gore and pictures of dead bodies in the Ukraine war also fall into the same category. So everyone just gets a yes/no decision on everything ranging from sex education to gore. In practice both these extremes aren’t very common on Lemmy; we just have a different community here. But it’s not entirely theoretical either. There are examples in the wild: 4chan, for instance, mixes pretty tame porn with fetish content, crime, gore and death.

    So in summary: the current state of (mis)using the NSFW tag actively leads to defederation, it does a disservice both to people who participate in adult conversations and to adolescents, and its overly simplistic design prevents some conversations that should be allowed.

    > Why do you think porn is bad/unsuitable for 14-17 year olds?

    My opinion doesn’t really count here. There are legal requirements people need to implement, whether they like it or not. So that’s kind of already the end of this conversation.

    I think it makes a difference whether you watch (somewhat tasteful) plain sex, or >!somebody dangling from a hook in the ceiling getting whipped by a disguised old man!<. It’s just not the same category, and we shouldn’t treat it as such. Similarly, it’s not the same whether you deliberately explore that at 17, or are inadvertently exposed to it at 12 while researching what sex education failed to provide you with.

    And there’s the aspect of me inviting friends and family to my self-hosted services, or discussing Linux server administration with people. I don’t want to mix either of that with porn. I think having it hidden by default is a good first step, and requiring an extra, deliberate step to enable it is good design. It just lacks any nuance, mingles valid use-cases with filtering that was made to do something else, and, as I pointed out with the defederation happening, comes with issues in practice.


    And I think the Fediverse offers some important advantages over other platforms. ActivityPub is very vague: we can just attach fields to label content, and the technical aspect is kind of simple to implement. And with federation, we have diversity built into the platform. This is our unique advantage. People have different use-cases, different moderation needs, and different perspectives and opinions on something like my proposal with the filtering. And I think the Fediverse turns out to be made exactly for something like this. I could have my instance my way and someone else can have a different opinion and have their instance another way.
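
    To sketch what “just attach fields” could look like, the snippet below builds an ActivityPub-style object in Python. The “sensitive” flag is what Mastodon and Lemmy already federate for NSFW; the “contentCategory” field is purely hypothetical, my illustration of a finer-grained label an instance could define via a JSON-LD extension.

    ```python
    import json

    # "sensitive" is the existing NSFW flag; "contentCategory" is a
    # hypothetical extension, declared under an example namespace.
    note = {
        "@context": [
            "https://www.w3.org/ns/activitystreams",
            {"contentCategory": "https://example.org/ns#contentCategory"},
        ],
        "type": "Note",
        "content": "A post about sexual health education.",
        "sensitive": True,
        "contentCategory": "sex-education",  # hypothetical, not in any spec
    }
    print(json.dumps(note, indent=2))
    ```

    Instances that don’t understand the extra field would simply ignore it, which is exactly why this kind of labeling is cheap to roll out incrementally.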

    But it requires some coordination effort. We need to agree on a foundation and some technical aspects. I don’t think a crazy rag rug works as a whole. And we already see some consequences of other disputes: moderation being a constant issue in the background, and instances separating from each other because there’s no nuance to moderation and defederation. And ultimately we want to talk to each other and connect, and provide everyone with a place they like.


  • Sure. For me it’s the other way around. I never really fell in love with microblogging. My hobbies are kind of mixed and sometimes niche, I sometimes don’t have anything of substance to post from my everyday life, and I really disliked the mob mentality and regularly surfacing toxicity in places like Twitter. At some point I tried Reddit and got hooked. It’s a very different approach whether you follow people or topics/communities. It’s less about who you are and more a marketplace of ideas. A level playing field, sorted by ideas and hobbies, and you can just dip in. It also lets you reach different audiences for different niche hobbies. And everything gets ordered like that. I mean, you also have hashtags on Mastodon, but it’s not really designed around this concept.

    It really has some appeal to me. I also sometimes participate in web forums and found a similar structure there. And I always liked how the free software community is supposed to work: it doesn’t matter who you are, whether you’re 15 or a 40-year-old woman… You all come to the same place and discuss your ideas and perspectives on things.

    It does have downsides, and it doesn’t necessarily foster good behaviour and being nice to people. I don’t have a final opinion on this. I think encouraging good behaviour in discussions requires some degree of ‘it matters who you are’, because having an image stick to you incentivizes you to behave properly. It’s not that big of an issue in practice; the overwhelming majority of people are nice and use the platforms to everyone’s benefit, not to troll and cause trouble.

    Thanks for your input. I’ve come to the conclusion that maybe I need to broaden my perspective and have a closer look at other places in the Fediverse. I’m pretty sure Mastodon ain’t it for me. But Friendica might be a good place to start. I’ve also had Akkoma recommended to me in this discussion. Maybe some software other than Lemmy is more closely aligned with my vision of what I’d like to run on my server. I’d like to stay compatible with Lemmy, since there are lots of nice people here and it’s usually fun to talk here, more so than in some other places.


  • Meh. Since you’re here… how is Friendica? Should I try it? I read it focuses on privacy and being a nice place, and has communities and distributed forums, “relationship control” and add-ons.

    On paper it looks like it has many more features to offer than, for example, Lemmy. I’d be interested in the distributed forum aspect. Do the added features tie into every aspect of the platform? Or is it mainly microblogging with a basic forum added on top? And when participating, for example, in this Lemmy discussion… is it a smooth experience, or can you tell you’ve left Friendica and only have basic functionality here? (I can tell from over here that you’re from a different platform, since your post includes the @ user mentions like Mastodon does. And I’ve tried Mastodon, and I think it’s not really a great experience interacting with Lemmy and KBin communities. Some of the structure of the threads gets lost in the process, and comments from other branches of the discussion don’t show up.)


  • Sure. But what about federation? Arbitrary content is pulled from other instances, and my users are confronted with that content too, not only with each other. I’d need to also disable federation. (Or am I missing something?)

    I think at that point I’d be better off installing Discourse or Flarum, and I’d have changed the whole vision of my instance. I started out envisioning a federated platform that can cater to adults and adolescents simultaneously. And now I’ve locked it down to just cater to the few adolescents I directly invite, done away with the federation aspect and also cancelled all the appeal to adults. There’s not much left of what I’d like.


  • > Wouldn’t you find exactly the same stuff on porn websites?

    Yes. And I think that’s bad practice. We should strive to be better than the average porn site.

    > How would you do otherwise with preserving user privacy?

    I think there are two issues at play:

    1. It’s a complex task. Usually that leads to people saying “we can never achieve 100%” and “it doesn’t fit every purpose”, and then nothing gets done. I’d argue this gets us like 40% of the way, and that’s better than nothing. And it’d get me all the way, and probably a few other people, too.

    2. I think verification should be delegated to the instances. There isn’t a single solution. In some jurisdictions it might be enough that people claim to be 18; those admins can choose a really simple solution. Other admins might not care, or might cater to minors; they can simply not activate the filters. A compromise might be requiring signup. That’d hide content from kids who aren’t logged in and are just browsing the web, which is already far better than just displaying it to them. What I’d like to do is have users request access and handle that manually, like some Discord servers and other software do. I know a few people I’d like to invite, and their ages, so it’d be no problem to unlock their accounts. I think it’s the same for other communities. And usually “eyeballing it” also works to some degree; it might be a valid approach for some admins. I know from experience you can often tell whether your opponent in a computer game, or the person you’re arguing with, is a 13-year-old kid or 35. It’s not perfect, but it does a decent job with the extremes.

    I’d like to abstain from the privacy-breaching methods in use by big tech companies like Google. Requiring phone numbers on signup or holding your ID up to the camera is too much. And it’s bad. I don’t want to tell admins how to handle verification. If they’re required to, or would like to see IDs, and their users are comfortable with it… I’ve included extensibility in the requirements, so they can. Maybe we’ll be provided with a proper solution in the near future. My German ID card can already vouch for my age without revealing my identity. It’s a zero-knowledge proof and the proper technical solution to age verification. I can also envision some “Web of Trust” providing this, something like PGP or CAcert does.

    I think there are some valid ideas, and some technical solutions are already out there and available. The issue is just that nobody uses them. And neither do we.

    > Also, how do you avoid falling in the reddit trap where every discussion vaguely about sexuality end-up being 18+

    That is a good question. These categories need to be concise, and the people ticking the boxes need to comprehend the meaning and consequences. I think moderators will; with the users, I’m not sure. I don’t think I had that issue on Reddit. A year ago, when I was still there, I occasionally replied to people on relationship_advice and some more explicit subreddits. I didn’t see any problems. But I’ve not been a heavy user; maybe I didn’t pay attention. I’ll listen if this is deemed a likely scenario or proves to happen in practice.


  • That is part of my idea. I don’t think it needs to be watertight and impossible to circumvent. My personal opinion is that if a 16-year-old really wants to watch porn or something, and they put in the effort to circumvent something that’s a bit more elaborate than just clicking “Yes” on a popup… they should be allowed to see it.

    But that’s just my opinion. And I’m not really concerned with what other instances do. It’s enough if it enables me and a few other people to have our instances how we like, and to invite people without worrying too much. My own server is also the only one I’m held responsible for. As far as I’m concerned, other people can do what they like.

    And it’s kind of pointless to try. Kids don’t need Lemmy or the Fediverse to watch adult content. They can just go to Pornhub and click yes. So I’m already in a position where I don’t care about other domains. But I’d like to keep my own website and Minetest server clean, and also potentially offer some more services, and at least do my best to do it right there.